Had the same thing happen to me. Simple problem: find out if two strings are ana...

leongrande · on April 24, 2018

You don't need to sort the strings. Create a vector indexed by ASCII code, incrementing the count for each character of the first string and decrementing it for each character of the second, while keeping a running count of the number of chars. If any entry goes negative, return false; otherwise, if the count of chars ends at zero, the strings are anagrams of each other. That's (n+m+128) operations, where n and m are the lengths of the two strings and 128 is for creating the vector.
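A minimal sketch of that counting approach, assuming ASCII-only input (anything outside 0-127 would raise an IndexError). The name `is_anagram_ascii` is made up for illustration, and the sketch compares lengths up front instead of keeping a running char count, which is a small variation on the description:

```python
def is_anagram_ascii(first: str, second: str) -> bool:
    """Return True if two ASCII strings are anagrams of each other."""
    # Different lengths can never be anagrams.
    if len(first) != len(second):
        return False

    counts = [0] * 128  # one slot per ASCII code

    # Count up for every character in the first string.
    for ch in first:
        counts[ord(ch)] += 1

    # Count down for the second string; a negative entry means the
    # second string has more of that character than the first.
    for ch in second:
        counts[ord(ch)] -= 1
        if counts[ord(ch)] < 0:
            return False

    # Equal lengths and no deficit imply every entry is back at zero.
    return True
```

With equal lengths checked first, no final scan of the vector is needed: any surplus in one slot would force a deficit in another.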

trevor-e · on April 24, 2018

Interviewer: "OK, how about with a unicode string?"

Tyr42 · on April 24, 2018

That sounds a lot like doing a radix sort of the strings, then comparing the run length encoded, sorted strings ;p

mindhash · on April 24, 2018

On a tangent, one way could be to assign a number code to each letter and add up the numbers that occur in each string. If the sums match, they are anagrams.

aquateen · on April 24, 2018

I don't see how the sum would be unique to a particular combination of letters.

tylerhou · on April 24, 2018

Works if you assign prime numbers to each letter and multiply instead. So a=2, b=3, etc.

tuukkah · on April 24, 2018

To get decent anagram lengths and complexity, implement the numbers as a dict of repetitions of primes, and implement the multiplication by summing the repetitions. ;-)

reinhardt · on April 24, 2018

In which case you can just compare the dicts without performing the multiplication (which happens to be the costliest part for arbitrary-precision integers).
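A rough sketch of comparing the dicts directly, using Python's `collections.Counter` as the dict of repetitions; the function name `is_anagram` is hypothetical. It counts code points, so it assumes the strings are already normalized the same way (it does no Unicode normalization itself):

```python
from collections import Counter


def is_anagram(first: str, second: str) -> bool:
    """Return True if both strings contain the same characters
    with the same number of repetitions."""
    # Counter maps each character to how many times it occurs;
    # two Counters compare equal when all their counts match.
    return Counter(first) == Counter(second)
```

This skips the primes entirely: the repetition counts already identify the multiset of letters.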

tuukkah · on April 25, 2018

Exactly.

jcadam · on April 24, 2018

I briefly mused about just summing the ASCII codes for each letter in the strings. But quickly discarded the idea for this reason :)

LyndsySimon · on April 24, 2018

“ad” = “bc”

You’re right.

D_Alex · on April 25, 2018

What if a=1, b=10, c=100 - etc? Assuming the strings were English words...
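A quick sketch of this idea, assuming lowercase a-z only; `weighted_sum` is a hypothetical helper. With base 10, each letter's count occupies its own decimal digit, so sums stay distinct only while no letter occurs ten or more times, which seems safe for ordinary English words but not for arbitrary strings:

```python
def weighted_sum(word: str, base: int = 10) -> int:
    """Sum base**index over the letters, with a -> index 0, b -> 1, ..."""
    return sum(base ** (ord(ch) - ord("a")) for ch in word)


# Anagrams share a sum, and the "ad" vs "bc" collision goes away.
same = weighted_sum("listen") == weighted_sum("silent")   # True
differ = weighted_sum("ad") != weighted_sum("bc")         # True: 1001 vs 110

# Ten repetitions of one letter carry over into the next digit.
collision = weighted_sum("a" * 10) == weighted_sum("b")   # True: both are 10
```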

nomel · on April 24, 2018

You would have to make sure the sum of any combination of all characters was unique. For example, if the code was the character number, a=1, b=2, etc, both "abc" and "bbb" would have the same sum.

So I think something silly like: character_code = len(string)*len(alphabet)^character_index should work.

pg_bot · on April 24, 2018

You need to multiply the numbers (and they must be prime), not add them.

jgforbes · on April 24, 2018

a=1, b=2, c=3, ...

ac = 1 + 3 = 4
bb = 2 + 2 = 4

eecc · on April 24, 2018

I guess the numbers should be primes... 4+1=3+2

shagie · on April 25, 2018

10 = 5 + 5 = 7 + 3

eecc · on April 25, 2018

that's right... even prime isn't enough

shagie · on April 25, 2018

Prime is ok if it’s multiplied (there is one and only one prime factorization of a number).... but that can get to absurdly large numbers.

Still, consider the word ‘abe’ with a = 2, b = 3, c = 5, d = 7 and e = 11.

abe would then be 2^1 * 3^1 * 5^0 * 7^0 * 11^1 = 66. ‘abba’ would be 2^2 * 3^2 = 36. Each set of anagrams would have its own distinct value.
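A small sketch of the prime-product idea, using only the a-e mapping given above; the names `PRIMES` and `prime_signature` are made up for illustration, and a full alphabet would need 26 primes. It needs Python 3.8+ for `math.prod`:

```python
from math import prod

# Mapping from the example above: a = 2, b = 3, c = 5, d = 7, e = 11.
PRIMES = {"a": 2, "b": 3, "c": 5, "d": 7, "e": 11}


def prime_signature(word: str) -> int:
    """Multiply the primes assigned to each letter of the word."""
    return prod(PRIMES[ch] for ch in word)


abe = prime_signature("abe")    # 2 * 3 * 11 = 66
bae = prime_signature("bae")    # same letters, same product: 66
abba = prime_signature("abba")  # 2 * 3 * 3 * 2 = 36
```

Python integers are arbitrary precision, so the product never overflows, but it does grow quickly with word length, which is the "absurdly large numbers" caveat.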