I just finished HsNub, and while it works basically, its not all that I was hoping for. The basic problem is that lots of functions which are entirely different have the same bytecode! For example, selecting the first element out of a data structure is given as lots of different names, often field selectors. To be useful, it would need to incorporate some knowledge of the types as well.
Just to show an example of it running, when trying it on my regular expression library, tweaked to add two functions func1 and func2 that are just like id, I get the result:
{Data.Char-cont, YHC.Internal-_id, Prelude-id, Prelude-Prelude.Enum.Prelude.Char.fromEnum, Prelude-Prelude.Enum.Prelude.Char.toEnum, Prelude-Prelude.Integral.Prelude.Integer.toInteger, Prelude-Prelude.Num.Prelude.Integer.fromInteger, RegExpData-func2, RegExpData-func1}
(plus plenty of other equal functions)
So it works, but its just a bit too verbose, and your program must type, whereas the bytecode doesn't. Its also really slow, but I'm sure that could be fixed with a bit of effort.
One thing I did find is that a massive amount of the standard libraries are the same. Some statistics about the functions in the prelude:
1826 functions
701 distinct bytecode sequences (38%)
45025 bytecodes, 27584 of which are required (61%)
1363 distinct constant tables (75%)
4732 constant entires, 4512 of which are required (95%)
Maybe a better bytecode format would store each function as a pointer to a bytecode chunk, and a pointer to a constant table - that way different ones could be mixed and matched. I don't expect the standard compiler to do this (way too expensive, harms debugging), but its useful to be able to do as a standalone optimiser, and nicely cuts away at the memory required.