Do you know of any large datasets, free or low cost, for experimenting with Hadoop? Any related pointers or links are appreciated.
Preferences:
At least 1 GB of data.
Production log data from a web server.
A few that I have found so far:
Also, can we run our own crawler to gather data from sites such as Wikipedia? Any pointers on how to do this are appreciated as well.