CV - Skills and Qualifications

From Sinfronteras
Revision as of 11:37, 9 January 2023 by Adelo Vieira





Programming and Software Development

Data Science

Other qualifications

  • I started programming around 15 years ago, while I was studying geophysics. In that part of my career, as a geophysicist, I coded solutions to mathematical and engineering problems and worked on data analysis and data processing, signal analysis in particular. (A signal is a function that conveys information about a phenomenon; sound, images and videos, for example, are all signals.) One of my main projects in this area was developing programs to perform Seismic Wave Propagation Simulations (Seismic Modelling). Through this experience I gained skills in Python, Matlab (a numerical computing and data analysis environment), Scilab and Shell scripting.


  • Research geophysicist at GRyDs

As a Research Geophysicist, I was responsible for performing a set of Signal analysis/Data processing tasks, and ensuring the correct integration and implementation of geophysical applications into a computer cluster platform.

  • Task automation using Shell scripting: for example, generating the images used to create seismic wave propagation videos, or automatically generating PDF reports with LaTeX that contained details about the executed process, such as run time versus the amount of data generated.
  • I have skills in Matlab, Scilab, Python and Shell scripting that I gained during my time in an R&D Unit at Simón Bolívar University (The Parallel and Distributed Systems Group - GRyDs). MATLAB (matrix laboratory) is a language and numerical computing environment that supports data analysis, data visualization, matrix manipulation, and numerical computation. It includes a large library of functions that make it easier to solve many mathematical and engineering problems. I used it, for example, for Signal Analysis, specifically for Seismic data analysis:
    • Signal Processing in Geophysics
    • Ex. 1: A program that lets the user define the coordinates of the layers of a geological model. It opens an image file of the model, and the user clicks with the mouse to select the set of points (coordinates) that define each layer. These coordinates are saved in a specific format that is used as input to another program, which builds the geological model entity used to perform Seismic Wave Propagation Modelling.


  • In recent years, I decided to reorient my career toward IT, specifically toward Data Science and Software Development.


  • During my BSc in Information Technology, I achieved an excellent academic record and gained a clear understanding of the most important Object-Oriented Principles and Concepts. I have developed several object-oriented Java applications.


  • I have a special interest in Web Development, and I have developed several Web Applications using different technologies:
  • HTML, CSS
  • PHP
  • JavaScript
  • But my main experience is using JavaScript frameworks:
  • React for the Frontend
  • Express.js (a Node.js framework) for the backend: HTTP REST APIs
  • Dash: Python web application framework for building data analytic applications


So, I'm a programmer. Even though I haven't held a programming position for a long period, I have programmed on many occasions throughout my academic and professional experience. As I said, I've been programming for 15 years, and during this time I have used many programming languages. I like programming so much that even when I write a report I use a programming-based tool (LaTeX) rather than a word processor like Microsoft Word. So programming logic and the principles and concepts of object-oriented programming are things I'm really proficient in. Of course, I don't have 10 years of experience working in a Software Developer role, so you can ask me something about programming that I don't know; but you can be sure that I know how to program and that I'm able to learn any new programming language or concept in a very short time. That is something I really want to make clear: I'm proficient in programming and very confident about my programming skills.


Python experience:
I started using Python during my experience in Seismic data analysis. With four years of professional experience in this field, I developed a solid understanding of the most widely used Python libraries for Data Analysis, Machine Learning, and Data visualization, including Numpy, Pandas, Scikit-learn, SciPy, Matplotlib, Plotly, etc.

More recently, I have acquired experience in Python web development using Django and Dash. In this regard, I have gained hands-on experience building REST APIs with Django and I'm currently actively engaged in expanding my skills by learning FastAPI.

These are a couple of Web Apps I have developed using Python: http://dashboard.sinfronteras.ws/sentiment http://www.vglens.sinfronteras.ws/

You can see a more extensive list of my projects at: http://wiki.sinfronteras.ws/view/CV#Portfolio



  • Projects:
  • In this project, we have created a GUI Java (Swing) Application for a Zoo Management System.


  • In this project, we have created a GUI Java (Swing) Application that simulates a trading day of a simplified model of a stock market.














Data Analysis (data interpretation, data modelling, performing analysis based on data) is NOT something new for me at all; it is something that I worked on for several years as a Geophysicist.


I am currently working on Business analysis/Data visualization...


During my career as a Geophysicist, I worked on Data Analysis and Machine Learning. I worked, for example, on:

  • Signal Analysis (Seismic data analysis), which is a form of Time Series Analysis (and Time Series is an important topic in Data Analysis).
  • Well-Log analysis, which is also a kind of Data Analysis, where we analyse physical properties of the geologic formations (the rocks) in the subsurface.


I also worked on the creation, training and testing of Machine Learning models (regression, classification) for Seismic data analysis and Borehole (well log) data analysis (HampsonRussell software by CGG).

  • Reservoirs classification / Estimation of reservoir properties:
    • To classify rock formations in the subsurface using measurements of physical properties of the rocks.
    • To predict some physical properties of the rocks (Porosity or Permeability) by using measurements of other properties. See this paper: Comparison of machine learning methods for estimating permeability and porosity of oil reservoirs via petro-physical logs - https://www.sciencedirect.com/science/article/pii/S2405656117301633
  • Seismic data classification


ML models for the prediction of reservoir properties can be trained (using supervised, unsupervised, or reinforcement learning techniques) on seismic data, well-log data, core samples, and production data. Production data includes production rates (the amount of hydrocarbons produced from the reservoir over time) and pressure and temperature data over time. These models can learn from historical data to make predictions about the properties of new reservoirs.

Reservoir properties are important characteristics of an oil or gas reservoir that determine the amount of hydrocarbon resources that can be extracted from it. Some common reservoir properties that can be predicted using ML models include:

  • Porosity: Porosity is a measure of the amount of pore space in a reservoir rock. ML models can be used to predict porosity based on well logs and seismic data.
  • Permeability: Permeability is a measure of the ability of fluids to flow through a reservoir rock. ML models can be used to predict permeability based on well logs and production data.
  • Saturation: Saturation is a measure of the fraction of the pore space in a reservoir rock that is occupied by a given fluid (for example, hydrocarbons). ML models can be used to predict saturation based on well logs and production data.
  • Lithology: Lithology refers to the type of rock that makes up a reservoir. ML models can be used to predict lithology based on well logs and seismic data.


So, in the same way that we use a supervised algorithm (for example, linear regression) to predict the price of a house from housing data (number of rooms, age of the house, lot size, etc.), in geophysics (or petrophysics) we can use measured physical properties of the rocks to estimate a property of interest, such as permeability or porosity.
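The analogy above can be sketched in a few lines of Python. This is only a minimal illustration, not the workflow used in any of the projects mentioned here: it fits a one-variable least-squares line that estimates porosity from a single well-log measurement. The names `fit_line` and `predict_porosity` and all the sample values are hypothetical, invented for the example.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data (invented values):
# sonic transit time in microseconds per foot vs. core porosity as a fraction.
transit_time = [55.0, 60.0, 65.0, 70.0, 75.0]
porosity = [0.05, 0.09, 0.13, 0.17, 0.21]

slope, intercept = fit_line(transit_time, porosity)


def predict_porosity(dt):
    """Estimate porosity for a new transit-time reading using the fitted line."""
    return slope * dt + intercept


print(predict_porosity(80.0))
```

In practice several logs would be used as features at once, and the fitted model would be validated against measurements that were held out of the training set.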


An oil well is a boring (a hole drilled) in the Earth that is designed to bring petroleum to the surface (oil well ~ borehole).

A well log is a record of measurements of physical properties of the geologic formations (the rocks in the subsurface) penetrated by a borehole. In other words, a well log records physical properties of the rocks as a function of depth. Some of the properties that are measured are: resistivity; the natural radioactivity of the formations (Gamma Ray log), which normally reflects the shale content of the formation because radioactive elements tend to be concentrated in shales; and sound wave velocity (sonic log), obtained by measuring the time required for a sound wave to travel a fixed distance. The underlying principle is that the velocity in the rock decreases as the porosity increases.
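As a small illustration of how a gamma-ray log relates to shale content, the sketch below computes the linear gamma-ray index, a common first-order shale indicator that rescales each reading between a clean-rock baseline and a shale baseline. The function name, the baseline values and the log samples are assumptions made up for the example.

```python
def gamma_ray_index(gr, gr_clean, gr_shale):
    """Linear gamma-ray index: 0 for clean rock, 1 for pure shale."""
    igr = (gr - gr_clean) / (gr_shale - gr_clean)
    return min(1.0, max(0.0, igr))  # clip to the physical range [0, 1]


# Hypothetical log samples: (depth in metres, gamma-ray reading in API units).
log = [(1500.0, 30.0), (1500.5, 75.0), (1501.0, 120.0)]

# Assumed baselines: 30 API for clean rock, 120 API for shale.
shale_index = [
    (depth, gamma_ray_index(gr, gr_clean=30.0, gr_shale=120.0))
    for depth, gr in log
]

for depth, igr in shale_index:
    print(depth, igr)
```

The result is again a property expressed as a function of depth, which is exactly the shape of data a well log provides.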


Signal analysis:
There are many mathematical concepts related to signal analysis and thus to time series analysis that I've been using for a long time as a geophysicist, such as:

  • Time series and Discrete signals
  • Correlation, Auto-correlation, Cross-correlation
  • Regression methods (Linear regression)
  • Convolution and Deconvolution
  • and, of course, concepts related to signal analysis like, Fourier series, Fourier transform etc.
  • and many other concepts related to data analysis...
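To make one of these concepts concrete, here is a minimal sketch of cross-correlation used to estimate the time delay between two discrete signals. It is written in plain Python for readability; the function names and the two toy signals are hypothetical.

```python
def cross_correlation(a, b, lag):
    """Sum of a[n] * b[n + lag] over the samples where both signals are defined."""
    total = 0.0
    for n in range(len(a)):
        m = n + lag
        if 0 <= m < len(b):
            total += a[n] * b[m]
    return total


def best_lag(a, b, max_lag):
    """Lag (in samples) at which b best matches a."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cross_correlation(a, b, lag))


# Toy signals: the second is the first delayed by 3 samples.
pulse = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]

print(best_lag(pulse, delayed, max_lag=5))
```

Auto-correlation is the same operation with a signal compared against itself, so `cross_correlation(pulse, pulse, 0)` gives the energy of the pulse.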


Examples of projects:

https://www.earthdoc.org/content/papers/10.3997/2214-4609.201800920


I also hold a Diploma in Predictive Data Analytics from CCT College, which I completed with distinction. I have also completed online courses on Data Analysis (mostly related to Python for data analysis), and I worked on NLP in my last 2 final degree projects; in my opinion, there is no better way of learning something than working on a long academic project.

  • Text classification: Supervised Machine Learning for Fake News Detection:
  • Sentiment Analysis: Developing a Web Dashboard for analyzing Amazon's Laptop sales data:
  • In the final project of my Bachelor (Honours) in IT, I worked on Sentiment Analysis using Python. Specifically, I developed a Web Dashboard for analyzing Amazon's Laptop sales data, mainly to perform a Sentiment Analysis of Amazon customer reviews.
    • I performed a Sentiment Analysis of Amazon customer reviews using both Lexicon-based and Machine Learning methods.
    • Lexicon-based Sentiment Analysis: One of the purposes of this study is to evaluate different Sentiment Analysis approaches. That is why I performed a Lexicon-based Sentiment Analysis using two popular Python libraries: Textblob and Vader Sentiment.
    • Machine Learning Sentiment Analysis: I built an ML classifier for Sentiment Analysis using the Naive Bayes algorithm and an Amazon review dataset from Wang et al. (2010). It is important to note that this is an extra result with respect to the initial objectives: I had not planned to carry out this study, but I realized that it would be very beneficial to include another Sentiment Analysis approach. This allowed me to evaluate and compare both approaches in terms of their performance.
    • In addition, a Word Emotion Association Analysis was also performed. This analysis complements the polarity analysis by adding more detail about the kinds of emotions or sentiments (joy, anger, disgust, etc.) in customer reviews. It was performed using the NRC Word-Emotion Association Lexicon.
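The difference between the two approaches compared above can be sketched with a toy example. This is not the project's code and it does not use TextBlob, VADER or the Wang et al. dataset: the lexicon, the training reviews and all function names are hypothetical, and the classifier is a bare-bones multinomial Naive Bayes with Laplace smoothing.

```python
import math
from collections import Counter

# Toy polarity lexicon (hypothetical; real lexicons are far larger).
LEXICON = {"great": 1, "love": 1, "fast": 1, "slow": -1, "broken": -1, "terrible": -1}


def lexicon_polarity(text):
    """Lexicon-based approach: sum the polarity of the known words."""
    score = sum(LEXICON.get(word, 0) for word in text.lower().split())
    return "pos" if score > 0 else "neg" if score < 0 else "neutral"


def train_naive_bayes(labelled_reviews):
    """Machine Learning approach: count words per class from labelled examples."""
    word_counts = {}
    class_counts = Counter()
    for text, label in labelled_reviews:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log-probability (Laplace smoothing)."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            count = word_counts[label][word]  # 0 for unseen words
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training reviews, invented for this example.
TRAINING = [
    ("great laptop love the screen", "pos"),
    ("fast and great battery", "pos"),
    ("terrible keyboard broken on arrival", "neg"),
    ("slow and terrible support", "neg"),
]
model = train_naive_bayes(TRAINING)

print(lexicon_polarity("love this fast laptop"))
print(classify("terrible slow laptop", model))
```

The lexicon-based method needs no training data but only knows the words in its dictionary, while the Naive Bayes classifier learns its vocabulary from labelled reviews; that trade-off is what makes comparing the two approaches worthwhile.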



  • Projects:


  • This Application was developed using Python-Django Web framework


  • This Application was developed using:

  • Back-end: Node.js (Express) (TypeScript)

  • Front-end: React (TypeScript)








  • SQL
https://content.dsp.co.uk/the-key-responsibilities-of-a-database-administrator
  • Install and configure database system
  • Install and test new versions of the database management system (DBMS)
  • Database Backup and Recovery
  • Develop, manage and test back-up and recovery plans
  • Maintaining database backup and recovery infrastructure
  • Troubleshooting
  • Make data available to authorized persons
  • Create user documentation and user guidelines (I also took part in the documentation, because they did not know whether they would always use this MediaWiki or develop their own application. The structure of the database was being documented for possible migrations)


  • Linux
  • I've been using Linux for about 15 years as my main OS. I consider myself a Linux power user, capable of writing shell scripts and performing administrative tasks. I'm mostly a user of Debian-based systems, but I have experience with the most popular flavors of Linux: Ubuntu, Red Hat, CentOS, Mint, SuSE.


  • Throughout my career, I have worked on several occasions in activities related to Linux administration:
  • Research geophysicist at GRyDs:
I was, for example, responsible for developing shell scripts for task automation and signal analysis.
  • WikiVox:
I had the opportunity to work on the installation and administration of a LAMP stack (LAMP stands for Linux, Apache, MySQL, PHP), that is, all the software needed to host a web application on a Linux server.
  • I have also developed a personal project that automatically backs up the personal data stored on my computer (and my Wiki) to a hard drive and to the cloud (a Linux server). To do so, I wrote a shell script using tools such as: rsync, ssh, sshpass, tar, zip, MySQL database backup, sed, gpg.
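The backup described above is a shell script. As a rough sketch of the archiving step only, the example below uses Python's standard `tarfile` module instead of `tar`, and leaves out the rsync transfer, the MySQL dump and the gpg encryption. The function name `make_backup` and the paths in the usage comment are hypothetical.

```python
import tarfile
import time
from pathlib import Path


def make_backup(source_dir, dest_dir):
    """Write source_dir into a timestamped .tar.gz inside dest_dir; return its path."""
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # The timestamp keeps successive backups from overwriting each other.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = dest / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(str(archive_path), "w:gz") as archive:
        # arcname stores paths relative to the folder name, not the full path.
        archive.add(str(source), arcname=source.name)
    return archive_path


# Example call (hypothetical paths):
# make_backup("/home/user/documents", "/mnt/backup_drive")
```

A script like this would typically be scheduled (for example with cron), with the resulting archive then copied to the remote server.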


  • Wiki - Organize information into a cohesive, searchable and maintainable system.
    • One of the most important skills I have, whose importance I often find hard to convey, is my Wiki management skills.
    • A Wiki is a website on which users can collaborate by creating and modifying content from the web browser. The best example is Wikipedia: someone can create an article and other users can then modify it online. A Wiki is an outstanding tool to organize information into a cohesive, searchable and maintainable system that can be accessed and modified online. The benefits of a wiki for organizing information are remarkable.
    I have a personal Wiki (based on the MediaWiki engine) where I document everything I'm learning and working on. So, I use a Wiki as a personal knowledge management system that allows me to organize information into a cohesive, searchable and maintainable system. The benefits that I've had from using a Wiki are amazing. It has allowed me to learn in a more effective way and, most importantly, to constantly review and improve on important topics by providing very convenient online access (so from anywhere) to organized and structured information.
    Take a look at some of my Wiki pages: http://perso.sinfronteras.ws/index.php/Computer_Science_and_IT


  • Academic assistant at USB: Communication, Presentation and Leadership Skills

As an Academic Assistant, I supported the lecturer by teaching some modules of the Geophysical Engineering program at Simón Bolívar University. I was usually in charge of a group of between 20 and 30 students during theoretical and practical activities.

  • Courses taught:
  • Seismic data processing: Concepts of discrete signal analysis (time series analysis), sampling, aliasing, and discrete Fourier transform. Conventional seismic data processing sequence.
  • Seismic methods: The convolutional model of the seismic trace. Propagation and attenuation of seismic waves. Interpretation of seismic sections.
  • Seismic reservoir characterization: Relations between the acoustic impedance and the petrophysical parameters. Well-Seismic Ties. Seismic data analysis (Inversion and AVO).
  • This experience has contributed to my professional development in two major areas:
  • By teaching modules, I have enhanced my technical geophysical knowledge.
  • I have also developed communication and presentation skills, as well as the leadership strategies needed to manage a group of students and to transfer knowledge effectively.


  • IDG: Communication and Sale Skills
    • My current job at IDG is about communication. First, because I work in a team, we always have to reach targets as a team, and communication within the team is always the key to reaching them. Secondly, because one of my main responsibilities is to call contacts (IT Managers) on behalf of our clients, and of course this requires effective and clear communication. I have to explain to the contact the reason for the call and the topic of the campaign and, most importantly, I have to create an atmosphere in the call where the contact feels comfortable and agrees to answer my questions.
    • In this position, I have improved my communication skills in French and English. I have learned how to build and maintain a professional relationship and improved my Active Listening Skills.
    • I also think that I have developed communication skills not only at work but also in other aspects of my life. I have always played team sports at a highly competitive level: Volleyball when I was a child, when I was a member of my state's Volleyball team and attended 1 national games; and Waterpolo at university, where I attended 5 National University Games. Those are activities where you develop many communication skills, sometimes without being aware of it.
    • I have to call IT Managers and establish and maintain a professional conversation with them in order to identify their next investments. From this conversation we gather information about their next investment. This information is required by our clients (IT Companies: IBM, DELL, NetApp, etc.), who use it in the next step of the sales process.
    • Let's say that IBM is looking to sell a particular product (a Cloud backup solution, for example). So, IBM requires IDG's services, asking for a number of contacts (IT Managers) that are planning to invest in backup solutions. Then, we establish a professional conversation with IT Managers from our database and identify those that are looking to invest in the product required by the client.
    • During the phone conversations, I have to explain the product that our clients are looking to sell and be able to handle objections. That is why this experience has made me aware of the latest solutions and technologies that the most important IT companies are working on.
    • At IDG, I have also completed a Certified Sales training. During this course, I learned and put into practice the most important concepts of the sales process.
    • Prospecting, Preparation, Approach, Presentation, Handling objections, Closing, Follow-up
    https://www.lucidchart.com/blog/what-is-the-7-step-sales-process


  • Target and KPI
    • I'm used to working in a Target Working Environment because I'm currently working in a TWE at IDG.
    • At IDG we have to reach a daily target of about €650.
    • To reach this target we need to generate what we call a «lead». A lead is a conversation that matches the criteria requested by the client. For example, if the client (let's say IBM) is asking for contacts that are looking to invest in Backup solutions, then every time we have a conversation in which the contact confirms that they are looking for backup solutions, this contact represents a «lead».
    • Each lead that we generate has a price, and we need to generate as many leads as needed to reach the target of €650. Normally an easy lead is worth about €65 and a complicated one about €180.
    • So, every day we need to fight to reach the target performance. We usually have many challenges to reach the target performance:
    • Data challenges: We make calls using particular data that has been prepared for a particular campaign. Often you can make many calls without reaching the contacts that you are looking for, so you can spend your day making calls but not having conversations with IT Managers. If you are not reaching the contacts, you cannot make leads.
    • Hard campaign challenges: That means that we have a campaign in which the client is asking for a difficult criterion. Let's say, for example, that the client is asking for contacts that are looking to invest in a particular solution (SAP applications for example). That represents a campaign challenge because we have to reach a contact that is looking to invest, specifically, in this solution.
    • Solutions: There are a few techniques that we apply when we face these challenges. Changing the data or the campaign you're working on is the first action we can take. But sometimes you cannot change the campaign, because we need to deliver the number of leads the client is asking for. We usually make calls using a platform that dials automatically, taking contacts from the database related to the campaign you're working on, so normally we don't need to worry about the criteria (company size, job title, industry) of the contacts we are calling. But when you have data problems, the solution is to search for contacts manually. That is a little tricky: you can try to pick the best contacts by doing manual research in the database, but it can take a long time and it does not guarantee that you will reach the contact and get leads. So when you have good data you should use the platform; otherwise, you should search for contacts manually. This manual research is where you have to propose ideas and develop a good methodology to find good contacts and get leads. One of the techniques we apply in a hard campaign is, for example, that when we get a lead from a particular company, we try to call other contacts from the same company, because we know that this company is interested in the kind of product the client is looking to sell.
    The other approach is to search for new contacts on the internet (usually on LinkedIn), but that is even trickier because it is hard to reach a new contact and get the lead. This is where I made an important contribution. The problem with this external research is that most of the contacts you find on LinkedIn are already in our database, so it usually doesn't make sense. But I realized that when we are looking for business job titles (sometimes the client asks for business titles), external research on LinkedIn does make sense: our database is composed mostly of IT Professionals (we have some business contacts, but not many), so the chance of finding a contact on LinkedIn who is not in our database increases a lot. By doing that, I was able to get a good number of leads for hard campaigns, and that is a concrete contribution that I made to my team.
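The target arithmetic in this section (a daily target of about €650, with leads worth roughly €65 or €180) can be checked with a short calculation. The function names are hypothetical, and the lead values are the approximate figures quoted above, not exact prices.

```python
import math

DAILY_TARGET = 650  # euros, approximate daily target
EASY_LEAD = 65      # euros, approximate value of an easy lead
HARD_LEAD = 180     # euros, approximate value of a complicated lead


def leads_needed(target, lead_value):
    """Number of leads of a single type needed to reach the target."""
    return math.ceil(target / lead_value)


def easy_leads_to_finish(target, hard_leads):
    """Easy leads still needed after a given number of complicated leads."""
    remaining = max(0, target - hard_leads * HARD_LEAD)
    return math.ceil(remaining / EASY_LEAD)


print(leads_needed(DAILY_TARGET, EASY_LEAD))   # easy leads only
print(leads_needed(DAILY_TARGET, HARD_LEAD))   # complicated leads only
print(easy_leads_to_finish(DAILY_TARGET, 2))   # after two complicated leads
```

With these figures, a day made only of easy leads needs ten of them, while four complicated leads are enough on their own.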


  • Simón Bolívar University and background in Mathematics/Physics

    I'm an engineer from the most important scientific university in Venezuela, Simón Bolívar University, and I really need to highlight its academic level and quality. If you check now, Simón Bolívar University still holds a good place in the LatAm University Rankings, but the university has been badly affected by the difficult political situation in the country. I don't know if you have heard about the critical political and economic situation in Venezuela. The fact is that when I started my degree, Simón Bolívar University was always in the top 10 of LatAm universities with a scientific and technological orientation.

    I have a very good background in formal and pure sciences, like mathematics and physics. I took 7 pure maths and 5 pure physics courses, not counting all the applied geophysics courses I took with a high content of mathematics, physics, or chemistry.

    If you review the course content of an IT program you will find at most 2 mathematics courses. I really think that it is very important for an IT professional to have a good background in mathematics. For example, to understand some computational concepts (functional programming, for example) you need a good mathematical background.


  • Geophysics:
Geophysics is an applied science and a multidisciplinary field that uses physics, mathematics, and geology to study the internal constitution and history of the earth. The term geophysics sometimes refers only to solid earth applications: Earth's shape; its gravitational and magnetic fields; its internal structure and composition; its dynamics and their surface expression in plate tectonics, the generation of magmas, volcanism and rock formation. [Wikipedia]. One of the first pieces of geophysical evidence used to support the movement of lithospheric plates (Plate tectonics) came from paleomagnetism. This is based on the fact that rocks of different ages record different magnetic field directions.
One of the main applications of Geophysics is in oil exploration, that is the area where I have experience. I specifically worked
During my academic and professional experience as a Geophysicist, I was involved in several data analysis topics:
  • Seismic exploration - Seismic processing
I specialized in Seismic exploration for oil and gas, specifically in Seismic analysis and Seismic data processing, whose mathematical foundation is closely related to Data Science. You could actually say that Seismic data processing is a form of Data Science.
Seismic analysis is a kind of Signal analysis, and Signal analysis is closely related to Time series analysis. Statistical signal processing uses the language and techniques of mathematical time-series analysis, but it also uses other concepts and techniques, such as signal-to-noise ratio, time/frequency domain transforms, and other concepts specifically related to the physical problem under study. Of course, there are also many concepts used in time series analysis applied to business and economics, such as time-series forecasting, trend analysis, etc., that are not present in the material on statistical signal processing; but in general Signal Analysis (which is the area where I have experience) is closely related to Data Analytics, and to Time-Series analysis in particular https://stats.stackexchange.com/questions/52270/relations-and-differences-between-time-series-analysis-and-statistical-signal-pr#:~:text=2%20Answers&text=As%20a%20signal%20is%20by,significant%20overlap%20between%20the%20two
======
The signal that is analysed in Seismic analysis (the seismic signal) is a Seismic wave. A Seismic wave is an elastic wave (often approximated as an acoustic wave) that propagates through the earth. This wave can be recorded to obtain a mathematical (or functional) representation of the seismic wave. This function (or signal), which is called a Seismogram, represents ground motion measurements as a function of time; and of course, these ground motions are related to the wave propagating through the earth.
The data that we analyse in Seismic Analysis (Seismic Data) consists of a large set of time series. These time series are called Seismograms or Seismic traces, but mathematically they are just time series.
In physical terms, we can say that a seismogram is basically a representation of a seismic wave propagating into the subsurface. Now, in mathematical terms, a seismogram (seismic trace) is a time series of ground motion values (the ground motions are related to the wave propagating in the subsurface). In other words, a seismogram describes ground motions as a function of time.
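To illustrate the idea that a seismic trace is just a time series, the sketch below builds a synthetic trace with the convolutional model of the seismic trace mentioned elsewhere on this page: a reflectivity series convolved with a wavelet. It uses a Ricker wavelet and plain Python lists; the sampling interval, wavelet frequency and reflector positions are hypothetical values chosen for the example.

```python
import math


def ricker(freq_hz, dt, half_length):
    """Ricker wavelet sampled every dt seconds, centred on t = 0."""
    samples = []
    for k in range(-half_length, half_length + 1):
        arg = (math.pi * freq_hz * k * dt) ** 2
        samples.append((1.0 - 2.0 * arg) * math.exp(-arg))
    return samples


def convolve(signal, kernel):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


dt = 0.004  # sampling interval in seconds (4 ms, assumed)

# Reflectivity series: zero everywhere except at two hypothetical interfaces.
reflectivity = [0.0] * 50
reflectivity[10] = 0.5    # interface at sample 10 (40 ms)
reflectivity[30] = -0.3   # interface at sample 30 (120 ms)

wavelet = ricker(freq_hz=25.0, dt=dt, half_length=10)
trace = convolve(reflectivity, wavelet)  # one synthetic seismic trace

print(len(trace))
```

Because the wavelet is centred on its middle sample, each reflector appears in `trace` shifted by `half_length` samples: the peak for the interface at sample 10 sits at index 20.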
In short, the purpose of seismic exploration is to create an image of the subsurface and to estimate the distribution of a range of properties - in particular, the fluid or gas content. This way the geophysicist is able to have a better idea of where oil or gas deposits can be located in the subsurface.
So, after the Seismic acquisition phase (which I'm not going to explain now, because I want to focus on seismic data processing, which was my area, and on its relationship with Data Science), the Seismic Data consists of a large set of time series. These time series are called Seismograms or Seismic traces, but mathematically they are just time series.
I have worked in this area in my two thesis projects (bachelor's and master's degrees). I have experience as an academic assistant on the Seismic Data processing course at Simón Bolívar University; I have worked at the CGGVeritas processing center in Caracas and in an R&D Unit at PDVSA and Simón Bolívar University. So I have considerable experience in Seismic data processing, but I'm sure that the most important thing of all is that I have the motivation to further develop my skills in this area. I am now incredibly motivated to pursue my career in Seismic data processing.
======
So, there are many mathematical concepts related to signal analysis and thus to time series analysis that I've been using for a long time as a geophysicist, such as:
  • Time series and Discrete signals
  • Correlation, Auto-correlation, Cross-correlation
  • Regression methods (Linear regression)
  • Convolution and Deconvolution
  • and, of course, concepts related to signal analysis like, Fourier series, Fourier transform etc.
What Seismic attributes are: https://wiki.aapg.org/Seismic_attributes



  • Projects:









  • Advanced experience with the most popular flavors of Linux: Debian, Ubuntu, Red Hat, CentOS
  • LAMP Administration: Apache, MySQL, PHP
  • Installation and Post-installation configurations
  • Users and Groups Administration
  • Modify File Permissions
  • Managing Processes
  • Backups
  • Network File System (NFS)
  • Remote Management with SSH