Really, you're going to use a logarithm because it fits one data point better?
Besides, you might as well just use all your data points directly. There's no real need to interpolate between them (and if you really want to optimize your ad costs that finely, don't use a function that predicts -infinity gross profit if you don't use ads).
> The graph above features a linear line of best fit. Clearly we can see above that the data isn't linear, and a linear line of best fit doesn't make sense
I think the reason for not using a linear fit was that it "doesn't make sense", but the reasoning is not given. It would be interesting if the article explicitly said why it doesn't make sense, and why the somewhat arbitrary choice of the log function does. As you point out, gross profit will not be massively negative when there is 0 ad spend.
By my eye, the gross profit as a function of ad spend looks approximately linear in the region where data was sampled: say gross profit = 400 + 0.5 * ad spend. If you plug that in and then optimise for total profit, the most profitable non-negative choice of ad spend is of course zero, since each extra unit of ad spend only brings back about 0.5 in gross profit!
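To make that concrete, here is a minimal sketch. It assumes total profit is simply gross profit minus ad spend (my reading of the article, not something it spells out), and the 400 and 0.5 are the eyeballed coefficients from above, not a real fit. The candidate spend levels are made up for illustration.

```python
# Eyeballed linear model from the comment above; not fitted to the real data.
def gross_profit_linear(ad_spend):
    return 400.0 + 0.5 * ad_spend

# Assumption: total profit = gross profit - ad spend.
def total_profit(ad_spend):
    return gross_profit_linear(ad_spend) - ad_spend

# Hypothetical spend levels to compare; any non-negative grid gives the same winner.
candidates = [0, 100, 200, 400, 800]
best = max(candidates, key=total_profit)

for spend in candidates:
    print(spend, total_profit(spend))
print("best ad spend:", best)
```

Total profit here is 400 - 0.5 * ad spend, which falls with every unit spent, so the search always lands on the smallest allowed spend.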
So in this case a structural modelling assumption that isn't well explained or justified (log vs linear vs any other function) has a very large impact on the answer. It seems like we're trying to maximise a function in a region where we don't have observed data.
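A rough way to see how much the functional form drives the answer, again assuming total profit P is gross profit minus ad spend x (the symbols m, c, a, b are generic fit parameters, not values from the article):

```latex
% Linear gross profit: g(x) = c + m x
P_{\text{lin}}(x) = c + m x - x, \qquad
P_{\text{lin}}'(x) = m - 1 < 0 \ \text{for } m < 1
\;\Rightarrow\; x^{*} = 0

% Logarithmic gross profit: g(x) = a \ln x + b, \ a > 0
P_{\log}(x) = a \ln x + b - x, \qquad
P_{\log}'(x) = \frac{a}{x} - 1 = 0
\;\Rightarrow\; x^{*} = a, \qquad
P_{\log}''(x) = -\frac{a}{x^{2}} < 0
```

So with a slope below 1 the linear model says to spend nothing, while the log model always has an interior optimum at x = a and sends P to minus infinity as x approaches 0. The recommended spend comes almost entirely from the chosen shape.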
Since the problem as given is data-poor and the model is trivial to compute, it seems like a reasonable candidate for either running some more experiments to get more data points, or using a statistical method that can express the uncertainty in both the structural modelling decision and the parameter estimates (e.g. a Bayesian analysis starting with a prior distribution over a richer class of functions with plausible behaviour near zero). After all, we don't have a physical theory that justifies any particular functional form for the relationship.