Sunday, December 30, 2007

Can't wait for Visual Studio 2008!

Quoting Sidu Ponnappa:

... When I hit Ctrl+Tab, I want to see the next darn tab. Not a blessed huge pop-up with a preview of the tab. Of all the silly ideas, why a preview of the tab when you can just show the darn tab instead. ... I'm sure someone somewhere in Redmond who's never used an IDE in their lives looked at Vista's application switcher and thought 'Hey, that's a good idea'. Maybe it is for a Desktop app switcher - but this is (purportedly) an IDE. Jeez. How would you like it if the next time you hit Ctrl tab in Firefox you get this lovely big pop-up covering half your screen with a tiny little preview of the page in there, huh? ...

And Sidu forgot to mention the extra 20 MB of memory wasted on this "usability feature". Well, not all is lost: I'm sure Microsoft will remember to make it optional ;)

Thursday, December 27, 2007

Our beloved friends, the viruses

A quote from a great New Yorker article on recent research into the role of ancient viruses in human evolution:

When the sequence of the human genome was fully mapped, in 2003, researchers also discovered something they had not anticipated: our bodies are littered with the shards of such retroviruses, fragments of the chemical code from which all genetic material is made. It takes less than two per cent of our genome to create all the proteins necessary for us to live. Eight per cent, however, is composed of broken and disabled retroviruses, which, millions of years ago, managed to embed themselves in the DNA of our ancestors.

Monday, December 24, 2007

One step at a time ...

One step at a time, we're getting there.

Quoting an article about the results of a two-year research effort to build a computer simulation of a neocortical column (the basic building block of the neocortex):

Markram believes that with the state of technology today, it is possible to build an entire rat's neocortex, which is the next phase of the Blue Brain project, due to begin next year. From there, it's cats, then monkeys and finally, a human brain.

Markram is banking on Moore's law holding steady, as a computer with the power of the human brain, using today's technology, would take up several football pitches and run up an electricity bill of $3bn a year. But by the time Markram gets around to mimicking a full human brain, computing will have moved on.


Computer model of a single neocortical column from a rat's brain (Photo: IBM)

However...

Markram is not holding his breath, waiting for some emergent consciousness to arise from the silicon brain. What he is after is something more prosaic, but also a lot more useful than a talking machine. By understanding the function of the brain, we can also begin to understand its dysfunction.

Thursday, December 20, 2007

Not getting anywhere? You may be missing the point completely

The mathematician David Marr, who pioneered the field of object recognition, once said that one does not learn how birds fly by studying feathers but rather by studying aerodynamics.

That's why I'm a bit skeptical about neuroscience in general, where a lot of effort is invested in understanding how individual neurons work. But some seem to be getting this point.

Monday, December 17, 2007

Reducing day-to-day information noise

Here's a concept I'm trying to apply to my life: increasing the signal-to-noise ratio of day-to-day activities (those little things that, summed up, suck up a lot of time and energy), like reading newspapers and RSS feed aggregators, or actually paying attention to some of the stuff people around you babble about. In summary: deciding where to invest your precious "micro" attention.

A good argument about the danger lurking in day-to-day noise comes from a passage I found in Fooled by Randomness by Nassim Nicholas Taleb. He makes up the story of an amateur stock investor (a dentist) who checks his profits online every single minute, so he is exposed to all the noise in the markets and suffers many unnecessary setbacks. Given the fictional portfolio for this dentist, the probability of seeing a gain over any 1-minute window is around 50.17%. Over a quarter it goes up to 77%, and over 1 year his odds of success are 93%.
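
For the curious, here is a rough sketch of where numbers like these can come from. It assumes normally distributed returns and, as far as I recall from the book, a portfolio expected to earn 15% a year above Treasury bills with 10% annual volatility. Those two figures, the trading calendar and the function name are my assumptions for illustration, not something stated in this post.

```python
from math import erf, sqrt

# Assumptions (not stated in the post): annual excess return and volatility.
ANNUAL_EXCESS_RETURN = 0.15
ANNUAL_VOLATILITY = 0.10

# Assumed trading calendar: 260 days of 8 hours each.
MINUTES_PER_YEAR = 260 * 8 * 60


def probability_of_gain(years):
    """Chance of a positive return over `years`, for normal returns."""
    mean = ANNUAL_EXCESS_RETURN * years          # drift grows with time
    stdev = ANNUAL_VOLATILITY * sqrt(years)      # noise grows with sqrt(time)
    z = mean / stdev
    # Standard normal cumulative distribution evaluated at z.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


horizons = [
    ("1 minute", 1 / MINUTES_PER_YEAR),
    ("1 quarter", 0.25),
    ("1 year", 1.0),
]
for label, years in horizons:
    print(f"{label:>9}: {probability_of_gain(years):.2%}")
```

Under these assumptions the results land very close to the figures quoted above. The point is that the expected gain grows linearly with the time scale while the noise only grows with its square root, so the shorter the window, the closer you get to a coin flip.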

To understand what I mean by noise, take a look at this graph:
and think of the dotted line as the market and the other colored lines as a filtered (de-noised) long term perception of it.

So take, for example, news feeds and collectively selected (supposedly high-quality) content from reddit.com, digg.com, news.ycombinator.com and others. Look at what you see day after day: what if you simply ignored it all and stopped reading? Following these sites day by day (which is something I've been doing for the past several months, if not years) is analogous to the dentist investor checking his profits every minute and suffering all those setbacks. What if you cut back and only caught up on tech news every two weeks? Every month? How much would you really miss?

After some time, news items that seemed relevant on a given day or week may turn out to be irrelevant: just noise, with no real implications once you look at all the developments of the month or semester. By never becoming aware of them in the first place, you save time.

There is also the other side of this argument: information noise has its advantages too, for instance the sea of unexpected opportunities hidden among the thousands of unread items in your RSS reader.

So I leave some open questions. What would be the best way to catch up while increasing the signal/noise ratio? That is, what is the best way to sample from last month's tech and software developments?

Sunday, December 16, 2007

Python wrapper for SUGGEST, a Top-N collaborative-filtering recommendation engine

I'm just releasing this simple wrapper/interface, as it may be useful for others.

In a few weeks I plan on contributing more test scripts and a Windows build.

About the wrapper:
About the wrapped library (SUGGEST):
SUGGEST is a Top-N recommendation engine that implements a variety of recommendation algorithms. Top-N recommender systems, a personalized information filtering technology, are used to identify a set of N items that will be of interest to a certain user. In recent years, top-N recommender systems have been used in a number of different applications, such as recommending products a customer will most likely buy; recommending movies, TV programs, or music a user will find enjoyable; identifying web pages that will be of interest; or even suggesting alternate ways of searching for information.

The algorithms implemented by SUGGEST are based on collaborative filtering, which is the most successful and widely used framework for building recommender systems. SUGGEST implements two classes of collaborative filtering-based top-N recommendation algorithms, called user-based and item-based.
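
To give a feel for what "item-based" means, here is a minimal pure-Python sketch. It is not SUGGEST's API, nor the wrapper's, nor necessarily the exact algorithm SUGGEST uses: the function name, the data layout and the choice of cosine similarity are all my assumptions for illustration.

```python
from collections import defaultdict
from math import sqrt


def top_n_item_based(baskets, user, n=2):
    """Recommend up to n items for `user` from item-to-item similarity.

    `baskets` maps each user to the set of items they already have.
    """
    # Invert the data: for every item, which users have it?
    users_of = defaultdict(set)
    for u, items in baskets.items():
        for item in items:
            users_of[item].add(u)

    def similarity(a, b):
        # Cosine similarity between two items seen as sets of users.
        common = len(users_of[a] & users_of[b])
        return common / sqrt(len(users_of[a]) * len(users_of[b]))

    owned = baskets[user]
    scores = defaultdict(float)
    for candidate in users_of:
        if candidate in owned:
            continue  # never recommend something the user already has
        for item in owned:
            scores[candidate] += similarity(candidate, item)

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:n] if score > 0]


baskets = {
    "ann": {"bread", "butter", "jam"},
    "bob": {"bread", "butter"},
    "cat": {"bread", "jam"},
    "dan": {"tea"},
    "eve": {"bread"},
}
print(top_n_item_based(baskets, "eve"))
```

A user-based variant would instead look for users similar to "eve" and recommend what they have. The item-based form has the advantage that the item-to-item similarities can be precomputed once.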

Greasemonkey script to speed up Unibanco login

For those who use Greasemonkey (a Firefox add-on) and have a checking account at Unibanco, I wrote a script that fills in the fields with your account number etc. and redirects to the password entry page.

To install it, you first need to install Greasemonkey. After installing it and restarting Firefox, visit the script's installation page below again and click the red "Install this script" button.

To uninstall, go to Tools/Manage User Scripts, select the Unibanco script and click Uninstall.

Install

Thursday, December 13, 2007

Enterprise social tagging/bookmarking

Here's another idea for a project that has been on my mind for a long time.

It's basically an opensource del.icio.us clone with some additional features that may be useful for enterprise environments.

The deliverable for this project is a server-side engine for companies to host themselves (or pay someone to host, like the many hosted corporate wiki offerings out there).

Enterprise-specific features may include: enhanced user profile info/metadata, Microsoft Active Directory integration for user sign-on, department/project segmentation for link visibility (security profiles), and link and tag suggestions based on company departments or specific projects/groups.

And just like any other half-decent idea, someone else has already implemented it, but at the time of this writing none of the implementations seems to fulfil my requirements:

Scuttle (http://sourceforge.net/projects/scuttle/) is an opensource implementation but does not offer enterprise features.

ConnectBeam (http://www.connectbeam.com/) targets the enterprise and has many features, but is not opensource.

This could fit the opensource-but-with-paid-customization-services business model quite well.

If you're interested, check this idea's microPledge entry.

Tuesday, December 11, 2007

Del.icio.us Tag Cleaner

What would a "Delicious Tag Cleaner" be? It is a tool for removing unnecessary tags from your del.icio.us account.

It is available at http://delicious.isnotworking.com/

If you're like me, you probably have thousands of bookmarks collected over years and years of web surfing and hundreds of tags used to describe them. But the thing is that over these months/years you haven't been able to come up with a consistent taxonomy for your tags.

I have, for example, dozens of different tags for expressing links related to software development: "dev", "devel", "development" etc.

So this tool suggests tags that could be merged together. You review the suggestions one by one, and the tool merges the ones you choose in your del.icio.us account.

Examples of suggested merges: “book”, “books” and “ebooks” tags into a single “books” tag. “devel” and “development” into only “dev” etc.
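
To illustrate the kind of suggestion involved, here is a toy sketch. It is not how the Tag Cleaner actually works: the function name and the heuristic (strip a plural "s", then group tags that share the shortest known prefix) are assumptions made up for this example.

```python
from collections import defaultdict


def suggest_merges(tags):
    """Group tags that look like variants of one another (toy heuristic)."""

    def stem(tag):
        # Crude plural handling: "books" -> "book", but leave short tags alone.
        t = tag.lower()
        return t[:-1] if t.endswith("s") and len(t) > 3 else t

    # Shortest stems first, so "dev" is found before "devel".
    stems = sorted({stem(t) for t in tags}, key=len)

    groups = defaultdict(list)
    for tag in tags:
        s = stem(tag)
        # The tag's own stem is always in `stems`, so a match always exists.
        root = next(c for c in stems if s.startswith(c))
        groups[root].append(tag)

    # Only groups with more than one tag are worth suggesting.
    return {root: members for root, members in groups.items() if len(members) > 1}


tags = ["dev", "devel", "development", "book", "books", "ebooks", "python"]
print(suggest_merges(tags))
```

Note that this naive version does not catch "ebooks" as a variant of "books". A real tool needs a smarter similarity measure, and a human to confirm each merge.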

Try it and leave some feedback here!

Sunday, December 09, 2007

Quick thought on the Pareto principle and how to be less annoying

The famous 80/20 principle could go like this: 80% of the effects (being annoying to others) can be explained by 20% of your habits.

Many will agree that constantly saying phrases with the word "I" in them is chief among these 20%.

So practice this (I try, but fail many times): think twice about whether what you are about to say about yourself is really relevant to others.

Wednesday, November 28, 2007

Opera 9 on Ubuntu Linux with AMD64

For those Opera fans running Ubuntu on 64-bit AMD CPUs:

Opera only provides Ubuntu Linux packages for i386, but you can force the installation of the i386 package by borrowing a shared library it depends on (Qt) from an i386 Ubuntu package:

1) Download the Ubuntu version of Opera 9 for i386 Linux from the official site.

2) Download the libqt3-mt deb package file for i386. (Try the search interface if the link is broken.)

3) Create a temp dir for libqt3-mt contents:
$ mkdir qttempdir

4) Unpack the libqt3-mt deb package with
$ dpkg-deb -x libqt3-mt_3.3.8really3.3.7-0ubuntu5.2_i386.deb qttempdir

5) Copy the i386 shared Qt lib to your /lib32 dir (removing minor version numbers from the filename, or creating symlinks):
$ sudo cp qttempdir/usr/lib/libqt-mt.so.3.3.7 /lib32/libqt-mt.so.3

6) Force the Opera debian package installation with
$ sudo dpkg -i --force-all opera*.deb

Monday, October 29, 2007

Decent tiny urls bookmarklet

Someone requested a decent tiny URL generator, and someone built one. But as it stands, it lacks a bookmarklet for quick one-click decent tiny URL generation.

So here it is: DecentUrl! Just drag this link to your browser toolbar.

Known issues:
  • Does not work on Internet Explorer. If you know how to fix it, please drop a comment or email.

Saturday, September 15, 2007

Online book reviews

I love reading, therefore I love books, therefore I love Amazon, therefore I spend a lot of time online browsing for books which I'll probably never buy and if bought probably never read.

One thing I realized a long time ago is that for some books (especially the controversial ones) the reviews provided by other readers are as valuable as the book itself.

A good example is the set of reviews for The God Delusion by Richard Dawkins. I don't even think I need to read the book after reading many of the great reviews there.

Here's an example of a guy who provides many insightful reviews.

Of course reviews must be taken with a grain of salt as humans have an unbelievable capacity for being biased without realizing it.

Speaking of useless reviews, take a look at the ones by this guy who has apparently become an expert on providing reviews for books he never read and products he never bought. Some are actually funny!

Sunday, August 26, 2007

What if I could build a huge helium balloon and let it go?


I'm not sure about you, but as a kid I always wondered how high those helium-filled balloons would go when you let them loose in the open air. I say wonder no more! Some guys did it, and managed to track the balloon's whereabouts and snap some stunning pictures from high up in the atmosphere: SABLE-3 Balloon Launch

Thursday, August 23, 2007

In which over-specification kills

There appears to be some confusion over the new pilot role titles. This notice will hopefully clear up any misunderstandings.

The titles P1, P2 and Co-Pilot will now cease to have any meaning within the BA operations manuals. They are to be replaced by Handling Pilot, Non-handling Pilot, Handling Landing Pilot, Non-Handling Landing Pilot, Handling Non-Landing Pilot, and Non-Handling Non-Landing Pilot. The Landing Pilot is initially the Handling Pilot and will handle the take-off and landing except in role reversal when he is the Non-Handling Pilot for taxi until the Handling Non-Landing Pilot hands the handling to the Landing Pilot at eighty knots. The Non-Landing (Non-Handling, since the Landing Pilot is handling) Pilot reads the checklist to the Handling Pilot until after Before Descent Checklist completion, when the Handling Landing Pilot hands the handling to the Non-Handling Non-Landing Pilot who then becomes the Handling Non-Landing Pilot. The Landing Pilot is the Non-Handling Pilot until the decision altitude call, when the Handling Non-Landing Pilot hands the handling to the Non-Handling Landing Pilot, unless the latter calls "go-around", in which case the Handling Non-Landing Pilot, continues handling and the Non-Handling Landing Pilot continues non-handling until the next call of "land" or "go-around", as appropriate.
British Airways memorandum, quoted in Pilot Magazine, December 1996

Thursday, August 02, 2007

In which I insult lawyers ...

Tiny bit of wisdom from Scott Berkun:
Why software sucks: "When you create you are exercising the greatest power in the universe: bringing something that didn’t exist before into the world. Making something for others is a gift. Few people in the world have the privilege of earning a living by creating things."

OK, so what does that have to do with lawyers? Simple: lawyers don't create anything! They basically make a living by making life more and more complicated while pretending to solve problems, as opposed to ... ahem ... engineers.

Oh, and this post title was inspired by the highly recommended Wondermark by David Malki.

Thursday, May 24, 2007

Major updates at imgSeek and related projects blog

Not really major updates, but what isn't major when the previous value was ZERO?
Check it out if you're interested in imgSeek, visual search technology and so on ...

Tuesday, May 01, 2007

API Design Guidelines

I keep going back to the same references on best practices for C++/Java API design, so I decided to summarize them all here for future reference.

General design guidelines

widget->repaint();
widget->repaint(true);
widget->repaint(false);
A somewhat better API might have been
widget->repaint();
widget->repaintWithoutErasing();
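
My reading of the snippet above is that a call like repaint(false) does not tell the reader what is being switched off, while a second, explicitly named method does. Here is the same idea as a minimal Python sketch, with a hypothetical Widget class that stands in for no real toolkit:

```python
class Widget:
    """Hypothetical widget, used only to illustrate call-site readability."""

    def __init__(self):
        self.log = []  # records what each call did, for demonstration

    def repaint(self):
        # Default behaviour: erase the background, then paint.
        self.log.append("erase")
        self.log.append("paint")

    def repaint_without_erasing(self):
        # Same painting step, but the name itself says what is skipped.
        self.log.append("paint")


widget = Widget()
widget.repaint()
widget.repaint_without_erasing()
print(widget.log)
```

The caller never has to look up what a bare True or False means.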

  • General naming rules: Do not abbreviate. Even obvious abbreviations such as "prev" for "previous" don't pay off in the long run, because the user must remember which words are abbreviated.
  • Naming classes: Identify groups of classes instead of finding the perfect name for each individual class.
  • Naming functions and parameters: The number one rule of function naming is that it should be clear from the name whether the function has side-effects or not. Parameter names are an important source of information to the programmer, even though they don't show up in the code that uses the API. Since modern IDEs show them while the programmer is writing code, it's worthwhile to give decent names to parameters in the header files and to use the same names in the documentation.
  • Naming boolean getters, setters, and properties: Finding good names for the getter and setter of a bool property is always a special pain. Should the getter be called checked() or isChecked()? scrollBarsEnabled() or areScrollBarsEnabled()?
    • Adjectives are prefixed with is-. And be consistent: adjectives applying to a plural noun have no prefix (no are-).
    • Verbs have no prefix and don't use the third person (-s).
    • Nouns generally have no prefix. ("Generally" because sometimes omitting the is- prefix can be misleading.)
  • Write to your API early and often: Code written against the API lives on as examples and unit tests.
  • Implementation should not impact API: Implementation details confuse users and inhibit freedom to change implementation. Shield your users from them
  • Minimize visibility of everything
  • Documentation matters: An API is meant to be read (several times) and understood by others, and those readers normally don't read every word you write (I know I don't), so be succinct and precise, and cover every single class/function.
  • Consider performance consequences of API design decisions: Bad decisions can limit performance (Making type mutable; Providing constructor instead of static factory; Using implementation type instead of interface)
  • Fail fast: Report errors as soon as possible after they occur
  • Avoid long parameter lists
  • Avoid return values that demand exceptional processing: return a zero-length array or empty collection, not null
  • Provide programmatic access to all data available in string form; otherwise, clients will parse strings
  • Overload with care: Avoid ambiguous overloadings. Just because you can doesn't mean you should and it's often better to use a different name
  • Use appropriate parameter and return types:
    • Favor interface types over classes for input: Provides flexibility, performance
    • Use most specific possible input parameter type: Moves error from runtime to compile time
    • Don't use string if a better type exists: Strings are cumbersome, error-prone, and slow
    • Don't use floating point for monetary values: Binary floating point causes inexact results.
    • Use double (64 bits) rather than float (32 bits): Precision loss is real, performance loss negligible
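
A few of the guidelines above (fail fast, return empty collections instead of null, no floating point for money) are language-independent, so here is a small sketch of them in Python for brevity. The Inventory class and its method names are hypothetical, invented only for this illustration.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 exactly, so sums drift.
float_total = 0.10 + 0.20
decimal_total = Decimal("0.10") + Decimal("0.20")
print(float_total == 0.30, decimal_total == Decimal("0.30"))


class Inventory:
    """Hypothetical class, used only to illustrate the guidelines."""

    def __init__(self):
        self._prices = {}

    def add_item(self, name, price):
        # Fail fast: reject bad input at the API boundary, not later on.
        if not isinstance(price, Decimal):
            raise TypeError("price must be a Decimal")
        if price < 0:
            raise ValueError("price must not be negative")
        self._prices[name] = price

    def items_cheaper_than(self, limit):
        # Always returns a list (possibly empty), never None, so callers
        # can loop over the result without a special case.
        return sorted(n for n, p in self._prices.items() if p < limit)


inventory = Inventory()
for name in inventory.items_cheaper_than(Decimal("1.00")):
    print(name)  # simply does nothing when there are no matches
```

The type check in add_item also shows the "most specific input type" idea as far as a dynamic language allows: a float price is refused as soon as it is passed in.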

Java specific

  • Supply Interfaces (Good reasons: Callbacks, Multiple inheritance, Dynamic proxies), but keep in mind that interfaces:
    • Can be implemented by anybody.
    • Cannot have constructors or static methods.
    • Cannot evolve.
    • Cannot be serialized.
  • Be careful with packages: The Java language has fairly limited ways of controlling the visibility of classes and methods. In particular, if a class or method is visible outside its package, then it is visible to all code in all packages. This means that if you define your API in several packages, you have to be careful to avoid being forced to make things public just so that code in other packages in the API can access them. The simplest solution to avoid this is to put your whole API in one package. For an API with fewer than about 30 public classes this is usually the best approach. If your API is too big for a single package to be appropriate, then you should plan to have private implementation packages.
  • Avoid Static Methods
  • Avoid making classes and methods sealed or final or non-virtual
  • Immutable classes are good: If a class can be immutable, then it should be.
  • The only visible fields should be static and final.
  • Avoid eccentricity. There are many well-established conventions for Java code, with regard to identifier case, getters and setters, standard exception classes, and so on.
  • Don't implement Cloneable: It is usually less useful than you might think to create a copy of an object. If you do need this functionality, rather than having a clone() method it's generally a better idea to define a "copy constructor" or static factory method.
  • Exceptions should usually be unchecked: Use a checked exception "if the exceptional condition cannot be prevented by proper use of the API and the programmer using the API can take some useful action once confronted with the exception." In practice this usually means that a checked exception reflects a problem in interaction with the outside world, such as the network, filesystem, or windowing system. If the exception signals that parameters are incorrect or that an object is in the wrong state for the operation you're trying to do, then an unchecked exception (subclass of RuntimeException) is appropriate.
  • Design for inheritance or don't allow it: Every method should be final by default (perhaps by virtue of being in a final class). Only if you can clearly document what happens if you override the method should it be possible to do so. And you should only do that if you have coded useful examples that do override the method.
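
The immutability and "copy factory instead of clone()" items have a close analogue outside Java. Here is a rough Python sketch of both. The Point class and its copy_of method are hypothetical names chosen for this example, and Python's frozen dataclasses are only an approximation of a Java class with final fields.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:
    """Hypothetical immutable value type."""
    x: int
    y: int

    @classmethod
    def copy_of(cls, other):
        # Named factory playing the role of a copy constructor.
        return cls(other.x, other.y)


original = Point(1, 2)
duplicate = Point.copy_of(original)
print(duplicate == original, duplicate is original)

try:
    original.x = 10  # any attempt to mutate is rejected
except FrozenInstanceError:
    print("Point is immutable")
```

Since the object cannot change after construction, sharing it is safe, and copying is rarely needed in the first place.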

C++ specific

  • Static Polymorphism: Similar classes should have a similar API. This can be done using inheritance where it makes sense -- that is, when run-time polymorphism is used. But polymorphism also happens at design time. Static polymorphism also makes it easier to memorize APIs and programming patterns. As a consequence, a similar API for a set of related classes is sometimes better than perfect individual APIs for each class.
  • Naming Enum Types and Values: When declaring enums, we must keep in mind that in C++ (unlike in Java or C#), the enum values are used without the type. So one guideline for naming enum types is to repeat at least one element of the enum type name in each of the enum values.
  • Pointers or References: Most C++ books recommend references whenever possible, according to the general perception that references are "safer and nicer" than pointers. However, sometimes it's best to use pointers because they make the user code more readable: the call site makes it clear that the arguments are likely to be modified by the function. With references, a call that looks like pass-by-copy may actually be pass-by-reference. There is no such doubt with pointers.

Monday, April 30, 2007

Advanced Python or Understanding Python - Google Video

Very enlightening: it's a really good 75-minute investment if you care about Python.

I wish I had seen a talk like this when I was learning Python, since it covers every single confusing aspect of the language.

Saturday, April 07, 2007

Ant Wasn't Supposed to be a Scripting Language

Another example of a tool that was never planned to be widely used the way it currently is.

Also serves as another example of why a tool should never aim at being the best and solve all problems in order to achieve success.

Tuesday, April 03, 2007

Ian Murdock is joining Sun

Debian creator Ian Murdock is joining Sun.

This is something rare to see these days: a computer industry luminary joining a company other than Google ...

Saturday, March 31, 2007

Why you should treat modern life and nature as the same thing

This is why I love and believe in technology, treating modern life and nature as the same thing:
If man has evolved the ability to override his evolutionary imperatives, then there must have been an advantage to his genes in doing so. Therefore, even the emancipation from evolution that we so fondly imagine we have achieved must itself have evolved because it suited the replication of genes.
by Matt Ridley, in The Red Queen: Sex and the Evolution of Human Nature.

Friday, March 23, 2007

To all credit card issuers in Brazil

Get a clue! This is the third offer I've received in a week. My bank already offers me a card with no fees, so why on earth are you trying to seduce me with one at 2.55-whatever/month?

You should actually be spending your marketing money telling me how much you'd rip me off, compared with your competition, if I don't pay you back in full every month ...

Monday, March 19, 2007

Automatic Multi Language Program Library Generation for REST APIs

A development worth watching:
Automatic Multi Language Program Library Generation for REST APIs

You never know when you're gonna need to formally specify REST APIs and as a by-product generate some example client stubs in multiple languages.

Tuesday, February 27, 2007

Evolving the Spectrum of Distributed Architectures

Great conclusion by Benjamin Carlyle in The Architectural Spectrum:

I still don't see where SOAP fits into the world, or even WAKA for that matter. The expense of rolling out a new protocol over the scale of the Web has already been demonstrated to be nearly impossible over the short term. HTTP/1.1 and IPv6 are examples. The Web has reached a point where it takes decades to bring about substantial change, even when the change appears compelling. HTTP can't be unmade at this point, but perhaps it can be extended. So long as their use remains Web-compatible, sub-architectures can extend HTTP and its content types to suit their individual needs. They may even be able to build a second-tier Web that eventually supplants the original Web.

I don't see a place for RDF. I see the Web as a world of mime types and namespace-free xml. I think you need to build communities around document types. I think the sub-architectures that (mis)use and extend the content types of the Web contribute to it, and that XML encourages this more than RDF does. Today we have HTML, atom, pdf, png, svg, and a raft of other useful document types. In twenty years time we will probably have another handful that are so immensely useful to the wider Web that we can't imagine how we ever lived without them. I predict that this will be the way to the semantic web: Hard-fought victories over specific document types that solve real-world problems. I predict that the majority of these document types will be based around the tree structure of XML, but define their own structure on top of it. I don't foresee any great number being built around the graph structure of XML, also defined on top of XML in present-day RDF/XML serialisations. If RDF is still around in that timeframe it will be used behind the firewall to store data acquired through standard non-RDF document types in a way that replaces present day RDBMS and SQL.



Saturday, February 24, 2007

Most of my best ideas arrive while showering !

Glad to know I'm not the only one. And I agree with this guy in that the reasons are really obvious: no one is disturbing you, you're relaxed, everything just flows.

Language independent exception handling on SWIG

This is quite useful when wrapping code using SWIG for several languages. By adding this snippet to my interface definition file:


// Language-independent exception handler
%include exception.i

%exception {
    try {
        $action
    } catch (const string& stringReason) {
        // Forward the C++ message as a runtime error in the target language
        SWIG_exception(SWIG_RuntimeError, stringReason.c_str());
    } catch (...) {
        SWIG_exception(SWIG_RuntimeError, "Unknown exception");
    }
}


I can do this in C++:


if (dbSpace.count(dbId)) {
    throw string("dbId already in use");
}


and catch native runtime exceptions in Python or Java:


try:
    myObject.doSomething(3)
except RuntimeError as e:
    print(e)

>> dbId already in use

Thursday, February 01, 2007

clocky, the alarm clock that runs away

Awesome

The alarm clock that runs away and hides when you don't wake up. Clocky gives you one chance to get up. But if you snooze, Clocky will jump off of your nightstand and wheel around your room looking for a place to hide. Clocky is kind of like a misbehaving pet, only he will get up at the right time.

Sunday, January 28, 2007

The CommSec iPod index - Global comparisons

I must say I predicted these results milliseconds before clicking on the article link. Brazil would top this list for sure:

Overall analysis

CommSec iPod nano index (2 gigabytes, US dollars, January 2007)

Country        Price
Brazil         $327.71
India          $222.27
Sweden         $213.03
Denmark        $208.25
Belgium        $205.81
France         $205.80
Finland        $205.80
Ireland        $205.79
UK             $195.04
Austria        $192.86
Netherlands    $192.86
Spain          $192.86
Italy          $192.86
Germany        $192.46
China          $179.84
Korea          $176.17
Switzerland    $175.59
NZ             $172.53
Australia      $172.36
Taiwan         $164.88
Singapore      $161.25
Mexico         $154.46
US             $149.00
Japan          $147.63
Hong Kong      $147.63
Canada         $144.20

Source: CommSec, Apple


Also related: Something I'd like to ask an economist