Friday, August 8, 2014

Problem 98 - C++

Problem 98 was quite the problem. I know for a fact I have never solved it before, as I distinctly remember not being able to properly parse what the problem meant the first time I looked at it. Though this time I certainly knew what the problem was, this problem was no slouch to solve. This problem was a rare case of Project Euler problems where there is almost no complicated math going on at all, its all just a matter of figuring out how to best program a solution. And though I will say that programming this solution was a bit frustrating at times, it was pretty fun and I had a good time doing it.

My first issue was a very odd one - I was trying to use a words.txt I had probably downloaded for an earlier Euler problem, while looking at the version online to determine the format. The formats were different, so I got some segfaults reading the output.

The next issues were from thing just taking too long in the massive cluster of loops that you can see if you look at the problem. Putting lots of early termination conditions fixed this problem.

Next, I was faced with problems that were really weird to debug and I did not know what was going wrong. Turns out, the issue was I had a loop going backwards in a string thusly:
for (size_t i = str.size() - 1; i >= 0; --i). Though size_t might be the "proper" type for indexing a string or other container, it is not the proper type for comparison against 0. size_t's are unsigned, so i >= 0 is always true, and --i causes underflow and this loop goes on for a while (I saw a while, not infinitely, as the issue was that this was in the getStrMapping loop, which has an earlier termination condition, which, when in an infinite loop, is always hit!).

After that, the problems were relatively small.

So, despite a lot of bumps in the road, this was a pretty fun problem. I ended up using a lot of C++ standard library data structures and other features, which was nice, not only because I have been writing a lot of C++ code for the Data Structures class I am TAing this upcoming fall, so I was well practiced/this was further practice, but also this problem is finally a nice C++ entry for this blog that is distinctly not just C.

Also worth noting is that my solution to this problem ended up having far more comments than my average solution. Maybe this is a result of pent-up frustration with languages that I don't even know the comment syntax in, or maybe I commented because I was just having so much fun, or maybe this problem was just so mentally stimulating that I had to comment, or maybe the length of time I spent on this problem caused me to go insane with an insanity whose symptoms include commenting your code. But anyway, here is my solution to problem 98:

Code runs in .2s on my machine (.6s if I dont compile with g++ -O3)
#include <stdio.h>
#include <vector>
#include <string>
#include <cmath>
#include <unordered_map>
#include <algorithm>
// --note: requires compilation with g++ -std=c++11
//
// Some of my other C++ solutions were basically written in C
// I think the includes above probe that that is not the case for this problem

void getSquaresOfSize(int size, std::vector<long>& squares) {
    squares.clear();
    long low = (long) ceil(sqrt(pow(10.0, size - 1.0)));
    long high= (long) ceil(sqrt(pow(10.0, size)));
    for (long i = low; i < high; ++i) {
        squares.push_back(i * i);
    } 
}

// Get mapping of string, and return whether it is valid
bool getStrMapping(const std::string& str, long square, std::unordered_map<char, int>& map) {
    map.clear();
    //std::unordered_map <int, char> invmap;
    //^this was dumb
    char invmap[10];
    // ? for a debug print test, not just to be cute
    for (size_t i = 0; i < 10; ++i) { invmap[i] = '?'; }
    // Back-to-front to agree with easy order of grabbing digits
    // int because unsigned size_t will underflow and i >= 0 will always return true!
    for (int i = int(str.size() - 1); i >= 0; --i) {
        int digit = square % 10;
        if (invmap[digit] != '?' && invmap[digit] != str[i]) {
            return false;
        }
        invmap[digit] = str[i];
        map[str[i]] = digit;
        square /= 10;
    }
    return true;
}

// A return value of 0 indicates there is no square anagram with the mapping
// any other return value is the square produced by a mapping
// this method assumes anagram-ness
long squareMatch(const std::string& str, const std::unordered_map <char, int>& map) {
    long maybeSquare = 0;
    for (size_t i = 0; i < str.size(); ++i) {
        if ( i == 0 && map.at(str[i]) == 0) {
            //Will produce a number with a leading 0!
            return 0;
        }
        maybeSquare = (maybeSquare * 10) + map.at(str[i]);
    }
    long root = (long)round(sqrt((long double)maybeSquare));
    if (root * root == maybeSquare) {
        return maybeSquare;
    } else {
        return 0;
    }
}

// Thank you stack overflow and the standard library
// http://stackoverflow.com/questions/18267407/check-whether-two-strings-are-anagrams-c
// note pass-by-value important here
bool isAnagram(std::string s1, std::string s2) {
    std::sort(s1.begin(), s1.end());
    std::sort(s2.begin(), s2.end());
    return s1 == s2;

}
long max(long x, long y) { return x > y ? x : y; }

void getAnagrams(const std::string& str, const std::vector<std::string>& strings,
        int start, std::vector<std::string>& anagrams) {
    anagrams.clear();
    for (size_t i = start; i < strings.size(); ++i) {
        if (i != 0 && isAnagram(str, strings[i])) {
            anagrams.push_back(strings[i]);
        }
    }
}

long maxAnagramForSize(int size, const std::vector<std::string>& strings) {
    std::vector<long> squares;
    if (strings.size() < 2) {
        //abort early as not to compute stupidly large squares
        return 0;
    }
    getSquaresOfSize(size, squares);
    std::unordered_map<char, int> map;
    long maxFound = 0;
    //This mass of loops doesn't look scary at all
    std::vector<std::string> anagrams;
    for (size_t i = 0; i < strings.size(); ++i) {
        getAnagrams(strings[i], strings, i+1, anagrams);
        if (anagrams.size() > 0) {
            for (size_t j = 0; j < squares.size(); ++j) {
                if (getStrMapping(strings[i], squares[j], map)) {
                    for (size_t k = 0; k < anagrams.size(); ++k) {
                        long match = squareMatch(anagrams[k], map);
                        if (match > 0) {
                            maxFound = max( max(maxFound, match), squares[j]);
                        }
                    }
                }
            }
        }
    }
    return maxFound;
}

int main (int argc, char ** argv) {
    
    std::vector<std::vector<std::string> > strings;
    // Overestimate max length
    strings.resize(20);
    for (size_t i = 0; i < strings.size(); ++i) {
        strings[i].resize(0);
    }
    char* word = NULL;
    size_t length;
    while (getdelim(&word, &length, ',', stdin) > 0) {
        const std::string& s = word;
        //pesky quotes
        const std::string realS = s.substr(1, s.size() - 3);
        strings[realS.size()].push_back(realS);
    }
    long ans = 0;
    for (size_t i = strings.size() - 1; i > 0; --i) {
        ans = maxAnagramForSize(i, strings[i]);
        if (ans > 0) {
            //Clearly an n-digit number is larger than a
            //k < n digit number
            break;
        }
    }
    printf("%ld\n", ans);
    return 0; 
}

No comments:

Post a Comment