I recently came across this little data challenge, which was posted by Zalando (one of the top fashion retailers in Europe) as a teaser for data scientists and analysts. The challenge is quite straightforward and is a good opportunity to show how to approach this kind of analysis using the standard Python tools and the interactive notebook.
For data analysis, the community is split between Python and R, but for spatial data it looks like the ecosystem has bet on Python. There are useful Python libraries for all stages of a geoprocessing pipeline, from data handling (shapely, GDAL/ogr, pyproj, ...) to analysis (shapely, (geo)pandas, PySAL, numpy/scipy, sklearn, etc.) to plotting and visualisation (matplotlib, descartes, cartopy, PyQGIS). I've always found the last stage (visualisation) to be the most neglected, and trying to join up the data and analysis threads with interactive web-based visualisations (like D3 and mapping libraries like leaflet) is usually a pain. The IPython notebook promises interactivity, and newer packages like Plotly and Bokeh are really pushing it along fast.