Difference between revisions of "Data Science"
Adelo Vieira (talk | contribs) (→Git and GitHub) |
Adelo Vieira (talk | contribs) |
Revision as of 12:18, 27 October 2018
Social Media Sentiment Analysis
https://www.dezyre.com/article/top-10-machine-learning-projects-for-beginners/397
https://elitedatascience.com/machine-learning-projects-for-beginners#social-media
https://en.wikipedia.org/wiki/Sentiment_analysis
https://en.wikipedia.org/wiki/Social_media_mining
Remote development
Eclipse - Connect to a remote file system
https://us.informatiweb.net/tutorials/it/6-web/148--eclipse-connect-to-a-remote-file-system.html
Mount a remote filesystem in your local machine
https://stackoverflow.com/questions/32747819/remote-java-development-using-intellij-or-eclipse
https://serverfault.com/questions/306796/sshfs-problem-when-losing-connection
https://askubuntu.com/questions/358906/sshfs-messes-up-everything-if-i-lose-connection
https://askubuntu.com/questions/716612/sshfs-auto-reconnect
sshfs root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o reconnect,ServerAliveInterval=5,ServerAliveCountMax=3 root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o allow_other root@sinfronteras.ws: /home/adelo/1-system/3-cloud
A faster way to mount a remote file system than sshfs:
https://superuser.com/questions/344255/faster-way-to-mount-a-remote-file-system-than-sshfs
Git and GitHub
Installing Git
https://www.digitalocean.com/community/tutorials/how-to-install-git-on-ubuntu-18-04
sudo apt install git
Configuring GitHub
https://www.howtoforge.com/tutorial/install-git-and-github-on-ubuntu/
We need to set the user details that Git attaches to each commit. To do this, run the following two commands, replacing "user_name" with your GitHub username and "email_id" with the email address you used to create your GitHub account.
git config --global user.name "user_name"
git config --global user.email "email_id"
git config --global user.name "adeloaleman"
git config --global user.email "adeloaleman@gmail.com"
Creating a local repository
git init /home/adelo/1-system/1-disco_local/1-mis_archivos/1-pe/1-ciencia/1-computacion/1-programacion/GitHubLocalRepository
Creating a README file to describe the repository
Now create a README file and enter some text like "this is a git setup on Linux". The README file is generally used to describe what the repository contains or what the project is all about. Example:
vi README
This is Adelo's git repo
Adding repository files to an index
This is an important step. Here we add everything that needs to be pushed to the website to an index. This can be text files or programs added to the repository for the first time, or existing files that have changed (a newer or updated version).
Here we already have the README file. So, let's create another file, sample.c, that contains a simple C program. Its contents will be:
vi sample.c
#include <stdio.h>

int main(void)
{
    printf("hello world\n");
    return 0;
}
So, now that we have two files, README and sample.c, add them to the index using the following two commands:
git add README
git add sample.c
Note that the "git add" command can be used to add any number of files and folders to the index. The index (also called the staging area) is a buffer-like space that holds the files and folders to be included in the next commit to the Git repository.
Committing changes made to the index
Once all the files are added, we can commit them. This means that we have finalized the additions and/or changes, and they are recorded in the local repository, ready to be uploaded to GitHub. Use the command:
git commit -m "some_message"
"some_message" in the above command can be any simple message like "my first commit" or "edit in readme", etc.
Creating a repository on GitHub
Create a repository on GitHub. The name of the repository should be the same as that of the repository on the local system; in this case, it will be "GitHubLocalRepository". To do this, log in to your account on https://github.com. Then click the plus (+) symbol at the top right corner of the page and select "create new repository". Fill in the details and click the "create repository" button.
Once this is created, we can push the contents of the local repository onto the GitHub repository in your profile. Connect to the repository on GitHub using the command:
git remote add origin https://github.com/adeloaleman/GitHubLocalRepository
Pushing files in local repository to GitHub repository
The final step is to push the local repository contents into the remote host repository (GitHub), by using the command:
git push origin master
GUI Clients
Git comes with built-in GUI tools for committing (git-gui) and browsing (gitk), but there are several third-party tools for users looking for platform-specific experience.
It seems that the official GitHub Desktop application is not available for Ubuntu. However, other similar applications are available for Linux: https://git-scm.com/download/gui/linux
For Linux there is, for example: https://www.gitkraken.com/
Anaconda
Anaconda is a free and open source distribution of the Python and R programming languages for data science and machine learning related applications (large-scale data processing, predictive analytics, scientific computing), that aims to simplify package management and deployment. Package versions are managed by the package management system conda. https://en.wikipedia.org/wiki/Anaconda_(Python_distribution)
Installation
https://www.anaconda.com/download/#linux
https://linuxize.com/post/how-to-install-anaconda-on-ubuntu-18-04/
Jupyter Notebook
https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook
Courses
eu.udacity.com
https://classroom.udacity.com/courses/ud120
www.coursera.org
https://www.coursera.org/learn/machine-learning/home/welcome
Others
https://www.udemy.com/machine-learning-course-with-python/
https://stackoverflow.com/questions/19181999/how-to-create-a-keyboard-shortcut-for-sublimerepl
Data Mining with R, Chapman and Hall
Chapter 3 - Predicting Stock Market Returns
We will address some of the difficulties of incorporating data mining tools and techniques into a concrete business problem. The specific domain used to illustrate these problems is that of automatic stock trading systems. We will address the task of building a stock trading system based on prediction models obtained with daily stock quotes data. Several models will be tried with the goal of predicting the future returns of the S&P 500 market index. (The Standard & Poor's 500, often abbreviated as the S&P 500, or just the S&P, is an American stock market index based on the market capitalizations of 500 large companies having common stock listed on the NYSE or NASDAQ.) These predictions will be used together with a trading strategy to reach a decision regarding the market orders to generate.
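To make the idea of "predicting future returns and turning them into market orders" concrete, here is a minimal Python sketch (the book itself works in R). It computes the h-day-ahead return of a series of closing prices and maps a predicted return to an order with a simple threshold rule. The function names, the 1% threshold, and the prices are illustrative assumptions, not taken from the book, whose trading strategy may well be more elaborate.

```python
from typing import List


def future_returns(closes: List[float], horizon: int = 1) -> List[float]:
    """Return (C[t+h] - C[t]) / C[t] for every day t whose future close is known."""
    return [
        (closes[t + horizon] - closes[t]) / closes[t]
        for t in range(len(closes) - horizon)
    ]


def to_order(predicted_return: float, threshold: float = 0.01) -> str:
    """Hypothetical rule: trade only when the predicted move exceeds the threshold."""
    if predicted_return > threshold:
        return "buy"
    if predicted_return < -threshold:
        return "sell"
    return "hold"


# Made-up closing prices, for illustration only.
closes = [100.0, 102.0, 101.0, 104.0]

# Here the realized returns stand in for model predictions.
returns = future_returns(closes)
orders = [to_order(r) for r in returns]

print(returns)
print(orders)  # ['buy', 'hold', 'buy']
```

In a real system the values passed to to_order would come from a prediction model rather than from the realized returns, and the threshold would have to account for transaction costs.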
This chapter addresses several new data mining issues, among which are:
- How to use R to analyze data stored in a database,
- How to handle prediction problems with a time ordering among data observations (also known as time series), and
- An example of the difficulties of translating model predictions into decisions and actions in real-world applications.
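The time-ordering issue in the list above can be sketched in a few lines of Python (again, the book uses R). With time series, the observations must not be shuffled: the model is trained on the earlier part of the series and evaluated on the later part. The sketch below builds examples from lagged returns and splits them chronologically. The function names, the number of lags, the 70% split, and the return values are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Tuple[float, ...], float]


def lagged_examples(returns: Sequence[float], n_lags: int = 2) -> List[Example]:
    """Pair the previous n_lags returns (features) with the next return (target)."""
    return [
        (tuple(returns[t - n_lags:t]), returns[t])
        for t in range(n_lags, len(returns))
    ]


def chronological_split(
    rows: Sequence[Example], train_fraction: float = 0.7
) -> Tuple[List[Example], List[Example]]:
    """Split without shuffling: the earliest rows train, the latest rows test."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


# Made-up daily returns, oldest first.
daily_returns = [0.01, -0.02, 0.03, 0.0, 0.01, -0.01, 0.02]

examples = lagged_examples(daily_returns, n_lags=2)
train, test = chronological_split(examples, train_fraction=0.7)

print(len(train), len(test))
print(test[0])
```

Every target in the test set is later in time than every target in the training set, so the evaluation mimics how the model would be used on future data.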