Category: math


posted by ben on 10.11.01 at 14:14, null, null, math, math, rant, politics, rant, rave, Leave a comment Permalink

So, so wrong. We need more math education. Yes, set theory may not be terribly useful (unless you interact with tuples on a daily basis), but calculus is a prerequisite for understanding the world around us. If we want to understand how many of the automated systems around us work, linear algebra and more is neccessary. The math education that comes with a typical college degree is totally insufficient to understanding the world.

That means only a select few can understand how things work, which in turn means that an even smaller select group can improve on the working of things. This is a problem. Wider math education means the ability of society to build more complex machines. That means great productivity and that means greater wealth.

Some education has value. Some does not. Math has value.

Saturday night in Boulder is exciting.

posted by ben on 07.07.07 at 20:24, null, math, math, ben tries to beat Livermore, Leave a comment Permalink

Histogram of highPrice(time)/openPrice(time) for AMD from 2002-2003. Note the spike at 1, or days on which the open price was the high price. On 26% of days, the high price is less than 1% greater than the open price. About 8% of the time, the high price is the open price.

On 52% of days, the high price is 2% greater than the open price. Does anything distinguish these days?

40% of the time, the high price never hits that 2% mark and the close price is less than the open. I don't like these days. I need to figure out how to avoid them. I suspect this is impossible.

Thinking about time series...

posted by ben on 07.07.07 at 17:51, null, math, math, Leave a comment Permalink

It's a good solution, the only problem is that it depends on data from the future.

A good metric is hard to find.

posted by ben on 07.07.07 at 17:47, null, kittens, math, math, ben tries to paint, Leave a comment Permalink

I don't understand how game theory and information theory fit together. All this momentum trading stuff has game theoretic aspects... for instance, should you sell with a 3% gain, or hold an extra few days for a 5% gain with some risk? I don't really know how to evaluate that, so I just do everything empirically... which is stupid.

Then, there are all these information theory questions, like given equity x went up 4% over the last few days, what is the probability of a 1% gain tomorrow... which seems like a notion that's tugging lightly as ideas like mutual information... since the question is really: is this Brownian motion, or something more systematic? Mutual information doesn't work as a metric, because the space the input series exists in is so huge that there's going to be lots of information, even though, for the intuitive notion I'm thinking of, there is in fact very little.

A good metric is hard to find.

So, I can evaluate those probabilities, find a bunch of joint probabilities, but how the hell does that fit in with these cutesy little strategies like holding a few extra days until the market wanders back to a profitable place?

That, and I don't know how to cross validate with time series data. I've been inching my way forward, giving it another year more data every once and a while to see what happens, which usually shows me that I've been over fitting like crazy. You can't test over different stocks, because they are incredibly dependent on one another.

Ack. I'm such a poser.

Crazy Sphere Packing

posted by ben on 06.10.16 at 13:40, null, math, math, Leave a comment Permalink
-Ronny Peikert

Bootstrapping Training Data

posted by ben on 05.07.30 at 17:06, null, math, math, Leave a comment Permalink

Have any of you seen anything like this?

I'm thinking it'd be cool to take a supervised set (indicators and labels) and an unsupervised set (indicators) and try to generate a larger supervised set by clustering. It seems really easy, but I can't find any papers. Clearly there are overfitting issues, so maybe the value added isn't that great... but if you could use independent algorithms maybe you could get a boosting effect and drive the misclassified bootstrapped examples to zero... yes? maybe?

Get Rich Slow Schemes

posted by ben on 05.07.29 at 17:42, null, null, math, 1 comment Permalink
Phone Browser Portal

It could run on server end or could function as a portal to garner traffic. It would store cookies for different devices. For simple phone browsers, it'd be easiest to strip out all the tags and pictures. For real browsers, it would be good to resize the page to the browser windows. If the browser supports horizontal scrolling, it might be best to keep the page at 800 or 1024. Surely this already exists... right?

Time Series Stock Analysis

Pick a stock with a steady mean but high variance over a small period. First, do the math to see if given trade times and overhead, it's even possible to make this work. If so, throw all the weka stuff at random walks, biased random walks, data, new data and finally the real world. I need a book on time series. SVM regression is for sisses.

Network Flow

latency is a function of bandwidth, latency-1 and load. There is a vicious cycle in that as load increases with a constant bandwidth, latency goes to infinity... assuming multiple entry points. Of course, I'm picturing this on a highway, not a LAN, so the wired metaphor may mean nothing.

As traffic increases, the speed at which it travles must increase to hold the time to arrival constant... or something close to that. I need a book on network flow.

Comment from: ben [Member] ·
What I was trying to say:

As traffic increases, the throughput must increase to hold travel time constant. But, the system resolves the stress by stopping moving, not correcting.

Usually systems react to stress in a way that reduces said stress. A rubber band distributes load in this way. With the traffic, however, the opposite happens. The systems reacts to the stress in a way that increases the stress.
Permalink 07/30/05 @ 19:06

machine learning

posted by ben on 05.06.17 at 01:32, math, math, 1 comment Permalink

Some models may have a higher potential complexity, but that additional potential is irrelevant if it is dificult to chose model parameters.

Comment from: Other Graham [Member] ·
I think I've got it: the same law holds for pop stars. Gwen Stefani may have a higher potential complexity than Christina Aguilera, but that's irrelevent when she can't decide whether she's a Harajuku girl or not.
Permalink 06/17/05 @ 10:04

If you search for NIPS, you find nipple clips.

posted by ben on 05.05.14 at 00:44, math, math, Leave a comment Permalink

My proto NIPSian paper: here (600k)

Strawberries Forever

posted by ben on 05.04.24 at 22:47, math, math, Leave a comment Permalink

I've been spending way too much time thinking about strawberries lately. I want this job. I now have 2 algorithms. One gives the optimal solution (except in the third case), but takes about 5 minutes to run. It uses a profoundly uninspired conglomerative clustering method.

The other gives a crap solution but takes somewhere from 0.4 seconds to eternity to run... you can choose. And the solution gets better the more you run it. This uses spectral clustering. If I were smart, I could think of a distance metric that imposes boundary conditions on the covering.

The question is, how long should the algorithm run... what is the proper trade off. This run took about 8 sec and seems both ok and typical of an 8 sec run. Here's the input data. Should I run it a longer or shorter time?

Strawberries are growing in a rectangular field of length and width at most 50. You want to build greenhouses to enclose the strawberries. Greenhouses are rectangular, axis-aligned with the field (i.e., not diagonal), and may not overlap. The cost of each greenhouse is $10 plus $1 per unit of area covered.

Write a program that chooses the best number of greenhouses to build, and their locations, so as to enclose all the strawberriess as cheaply as possible. Heuristic solutions that may not always produce the lowest possible cost will be accepted: seek a reasonable tradeoff of efficiency and optimality.

Your program must read a small integer 1 ≤ N ≤ 10 representing the maximum number of greenhouses to consider, and a matrix representation of the field, in which the '@' symbol represents a strawberry. Output must be a copy of the original matrix with letters used to represent greenhouses, preceded by the covering's cost. Here is an example input-output pair:

Input Output

In this example, the solution cost of $90 is computed as(10+8*3) + (10+7*3) + (10+5*3).

QED bitch.

posted by ben on 05.04.06 at 18:00, math, math, 1 comment Permalink

The Economist has an article claiming computers will be doing math in 20-50 years.

"Why should the non-mathematician care about things of this nature? The foremost reason is that mathematics is beautiful, even if it is, sadly, more inaccessible than other forms of art."

First: Math isn't art. It's demoralizing rigorous work. It drives men like Godel and Erdos mad (assuming the mad aren't simply drawn to math). If, through the hand of providence, that doesn't happen, then they end up demoralized like Hardy. A life in math is a life of suffering. A life in art involves wine, love and the beautiful people.

Second: Sure, I can see computers filling in Flickr like meta data. They're good at that sort of mundane stuff, even if the algorithms aren't quite there. But proving things? No. The four colours proof wasn't done by computers. Computers only crunched some finite cases. They didn't come up with those cases. When we have a computer that can structure a proof, then I'll believe.

On the other hand, this proof automating software is cool. I want to use it for something. The manual for HOL:

Comment from: collin [Member] ·
I, of course agree with Ben, as I am already mad or will be soon. And as usual the article is somewhat sensationalist. Of course all their examples are simple a computer checking some large finite number of cases. Wait, I thought I had something interesting to say...

I found this to be quite interesting:
Although the Annals will publish Dr Hales's paper, Peter Sarnak, an editor of the Annals, whose own work does not involve the use of computers, says that the paper will be accompanied by an unusual disclaimer, stating that the computer programs accompanying the paper have not undergone peer review. There is a simple reason for that, Dr Sarnak says—it is impossible to find peers who are willing to review the computer code. However, there is a flip-side to the disclaimer as well—Dr Sarnak says that the editors of the Annals expect to receive, and publish, more papers of this type—for things, he believes, will change over the next 20-50 years. Dr Sarnak points out that maths may become “a bit like experimental physics” where certain results are taken on trust, and independent duplication of experiments replaces examination of a colleague's paper.

I find it suprising, to the point of doubiousness, that no "peer" was willing to look at his code. I understand that corectness proof of code/languages is difficult, but really no one would look at it? Then you have the question, what if a bit flipped somewhere on one of the cases? C'est la vie...
You can find the paper here.
Permalink 04/07/05 @ 13:15