Data Science
Projects portfolio
- Try the App at http://dashboard.sinfronteras.ws
- Github repository: https://github.com/adeloaleman/AmazonLaptopsDashboard
- Visit the Web App at http://www.vglens.sinfronteras.ws
- This Application was developed using Python-Django Web framework
- Visit the Web App at http://62.171.143.243
- Github repository: https://github.com/adeloaleman/WebApp-CloneOfTwitter
- This Application was developed using:
  - Back-end: Node.js (Express) (TypeScript)
  - Front-end: React (TypeScript)
- Visit the Web App at http://fakenewsdetector.sinfronteras.ws
- Github repository: https://github.com/adeloaleman/RFakeNewsDetector
Data Analytics courses
Data Science courses
- Posts
- Top 50 Machine Learning interview questions: https://www.linkedin.com/posts/mariocaicedo_machine-learning-interviews-activity-6573658058562555904-CzeV
- https://www.linkedin.com/feed/update/urn:li:ugcPost:6547849699011977216/
- Udemy: https://www.udemy.com/
- Python for Data Science and Machine Learning Bootcamp - Basic level
- Machine Learning, Data Science and Deep Learning with Python - Basic level - Similar to the previous one
- Data Science: Supervised Machine Learning in Python - More advanced level
- Mathematical Foundation For Machine Learning and AI
- The Data Science Course 2019: Complete Data Science Bootcamp
- Coursera - By Stanford University
- Udacity: https://eu.udacity.com/
- Columbia University - COURSE FEES USD 1,400
Possible sources of data
What is data
It is difficult to define such a broad concept, but the definition I like is that data is a collection (or any set) of characters or files, such as numbers, symbols, words, text files, image files, audio files, etc., that represent measurements, observations, or just descriptions, and that are gathered and stored for some purpose. https://www.mathsisfun.com/data/data.html https://www.computerhope.com/jargon/d/data.htm
Qualitative vs quantitative data
https://learn.g2.com/qualitative-vs-quantitative-data
Qualitative data | Quantitative data |
---|---|
Qualitative data is descriptive and conceptual information (it describes something) | Quantitative data is numerical information (numbers) |
It is subjective, interpretive, and exploratory | It is objective, to-the-point, and conclusive |
It is non-statistical | It is statistical |
It is typically unstructured or semi-structured. | It is typically structured |
Examples: see the unstructured data examples below. | Examples: see the structured data examples below. |
Discrete and continuous data
https://www.youtube.com/watch?v=cz4nPSA9rlc
Quantitative data can be discrete or continuous.
- Continuous data can take on any value in an interval.
  - We usually say that continuous data is measured.
  - Example: temperature measurements in ºF. Temperature can take any value within an interval, and it is measured (not counted).
- Discrete data can only take specific values.
  - We usually say that discrete data is counted.
  - Discrete data is usually (but not always) whole numbers.
  - Examples:
    - Possible values of a die roll: 1, 2, 3, 4, 5, 6.
    - Shoe sizes: half sizes such as 7.5 are not whole numbers, but a shoe size still cannot be just any number.
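
The distinction above can be sketched in a few lines of Python. This is only an illustration: the function names, the shoe-size range (4 to 14 in half-size steps), and the temperature interval (60 to 70 ºF) are assumptions chosen for the example, not values from any standard.

```python
# Discrete: a die roll can only take one of six countable values.
DIE_FACES = [1, 2, 3, 4, 5, 6]


def is_valid_die_roll(value):
    """Return True only for the six values a die can show."""
    return value in DIE_FACES


def is_valid_shoe_size(size, smallest=4.0, largest=14.0):
    """Discrete but not whole: only multiples of 0.5 in an assumed range."""
    return smallest <= size <= largest and size * 2 == int(size * 2)


def is_valid_temperature(temp_f, low=60.0, high=70.0):
    """Continuous: every real value inside the (assumed) interval is possible."""
    return low <= temp_f <= high
```

A value such as 3.5 is rejected as a die roll and 7.3 is rejected as a shoe size, whereas any temperature inside the interval, for example 65.123, is accepted.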
Structured vs Unstructured data
https://learn.g2.com/structured-vs-unstructured-data
http://troindia.in/journal/ijcesr/vol3iss3/36-40.pdf
Structured data | Unstructured data | Semi-structured data |
---|---|---|
Structured data is organized within fixed fields or columns, usually in relational databases (or spreadsheets), so it can be easily queried with SQL. https://learn.g2.com/structured-vs-unstructured-data https://www.talend.com/resources/structured-vs-unstructured-data | It is data that doesn't fit easily into a spreadsheet or a relational database. | The line between semi-structured and unstructured data has always been unclear. Semi-structured data usually refers to information that is not stored in a traditional database but has some organizational properties that make it easier to process. For example, NoSQL documents are considered semi-structured data because they contain keywords that make the documents easier to process. https://www.youtube.com/watch?v=dK4aGzeBPkk |
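
As a rough illustration of the difference between structured and semi-structured data, the sketch below stores the same kind of information first in a relational table and then in a JSON document. The table name, column names, and sample values are made up for the example.

```python
import json
import sqlite3

# Structured: fixed columns in a relational table, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE laptops (brand TEXT, price REAL)")
conn.executemany(
    "INSERT INTO laptops VALUES (?, ?)",
    [("A", 900.0), ("B", 1200.0)],
)
expensive = conn.execute(
    "SELECT brand FROM laptops WHERE price > 1000"
).fetchall()
conn.close()

# Semi-structured: a JSON document has keys that help processing,
# but no fixed schema (the second review has no "text" field).
raw = '{"brand": "B", "reviews": [{"stars": 5, "text": "Great"}, {"stars": 3}]}'
doc = json.loads(raw)
stars = [review["stars"] for review in doc["reviews"]]
texts = [review.get("text") for review in doc["reviews"]]
```

The SQL query relies on every row having the same columns, while the JSON code has to tolerate missing keys, which is why it uses `get` for the optional field.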
It is important to highlight that the huge increase in data in the last 10 years has been driven by the increase in unstructured data. Currently, some estimates indicate that there are around 300 exabytes of data, of which around 80% is unstructured.
The prefix exa indicates multiplication by the sixth power of 1000 (1000^6 = 10^18).
Some sources also suggest that the amount of data is doubling every 2 years.
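
The figures above can be worked through with a little arithmetic. The sketch below takes the quoted estimates (300 exabytes, 80% unstructured, doubling every 2 years) at face value; the function name and the 10-year horizon are assumptions made for the example.

```python
EXA = 1000 ** 6  # the prefix exa: 10**18

total_bytes = 300 * EXA                      # about 300 exabytes in total
unstructured_bytes = total_bytes * 80 // 100  # about 80% is unstructured


def projected_exabytes(current_eb, years, doubling_period=2):
    """Project a volume that doubles once every `doubling_period` years."""
    return current_eb * 2 ** (years / doubling_period)


unstructured_eb = unstructured_bytes // EXA   # 240 exabytes
in_ten_years = projected_exabytes(300, 10)    # five doublings of 300 EB
```

Under these assumptions, 240 of the 300 exabytes are unstructured, and five doublings over 10 years multiply the total by 32.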
Data Levels and Measurement
Levels of M