Showing posts from August, 2018

"Big Data" technologies

Big Data programming model and technologies

Tools:

Anaconda
  Open source (free).
  Most popular at 33.4% in 2018.
  Has R and Python versions.
  IDEs: Jupyter, RStudio, Spyder, and JupyterLab
  Editors: Jupyter, RStudio, Spyder, and Visual Studio Code
  Platforms: Linux, macOS, Windows
  Visualize your data: Matplotlib, Bokeh, Datashader, and HoloViews
  Machine learning & deep learning models: scikit-learn, TensorFlow, H2O, and Theano
  Analyze: Dask, NumPy, pandas, and Numba
  Keras: a wrapper on top of TensorFlow.
================================================================
R
  48.5% in 2018.
  R is "a language for statisticians built by statisticians."
  Open source (free).
  ggplot2
  SparkR: bindings for using Spark from R.
  Disadvantages:
    1) Hard to be productive in R (if no prior Matlab, SAS, or Octave experience).
    2) Limited for more general-purpose work.
================================================================
Python
  open source (f...