Hello, it's me

Anna Hidalgo Costa

and I'm a Data Analyst focused on Machine Learning

with experience in data exploration, cleaning, and modeling for strategic decision-making, including machine learning solutions like diabetes risk prediction.
Expert in Power BI, SQL, Python, Excel, and Tableau, with experience in data processing and analysis in both on-premise and cloud environments.
I applied these skills to develop a diabetes prediction model from clinical data, achieving 82% accuracy with feature engineering and XGBoost.
Passionate about automation and generating valuable insights, from building ETL pipelines to creating predictive models that support healthcare decisions.

annahico@gmail.com | +34 678567628 | Barcelona city, Spain



Latest Projects

EDA Marvel vs DC

This project analyzes data from Marvel and DC franchises to identify patterns and perform key comparisons. The goal is to understand how various factors influence the success of their movies and series.

EDA London Crime

This project analyzes London's crime rates (2008-2016), focusing on violent crimes and their seasonal patterns. The dataset covers all crimes but lacks crime type details, so it requires preprocessing. The analysis also identifies daylight saving months.
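
As a rough illustration of the daylight saving step, here is a minimal sketch that is not taken from the project itself. It assumes the standard UK rule (British Summer Time runs from the last Sunday of March to the last Sunday of October) and works at day granularity, ignoring the exact changeover hour. The function names `last_sunday`, `is_summer_time`, and `summer_time_share` are hypothetical.

```python
import calendar
from datetime import date, timedelta


def last_sunday(year: int, month: int) -> date:
    """Return the date of the last Sunday in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # weekday(): Monday is 0 ... Sunday is 6, so this is 0 when last_day is a Sunday.
    days_since_sunday = (last_day.weekday() + 1) % 7
    return last_day - timedelta(days=days_since_sunday)


def is_summer_time(day: date) -> bool:
    """Approximate check: is this calendar day inside British Summer Time?

    Assumption: the start day counts as summer time and the end day does not.
    """
    start = last_sunday(day.year, 3)
    end = last_sunday(day.year, 10)
    return start <= day < end


def summer_time_share(year: int, month: int) -> float:
    """Fraction of the days in a month that fall inside British Summer Time."""
    days_in_month = calendar.monthrange(year, month)[1]
    flagged = sum(
        is_summer_time(date(year, month, d)) for d in range(1, days_in_month + 1)
    )
    return flagged / days_in_month


if __name__ == "__main__":
    for month in range(1, 13):
        print(2016, month, round(summer_time_share(2016, month), 2))
```

For monthly data, a share like this could be joined onto each year and month as an extra column, so that partially affected months (March and October) can be told apart from fully affected ones.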

EDA Airbnb Model

This project analyzes Airbnb listing data to uncover insights about pricing, availability, and the key factors that influence rental trends. The dataset provides detailed information on each property, including location, price, number of reviews, and availability.

Hackathon Talent Arena

Developed a web app to analyze and visualize gentrification trends, using Python, Pandas, and React.js. Integrated data insights into the app with Node.js and MongoDB, ensuring real-time accuracy. Collaborated in an agile team to deliver an innovative, impactful solution.

Academic ML

Development of a binary classification model to predict high academic performance in Portuguese and Mathematics subjects using student profile data (UCI Student Performance dataset). The objective is to evaluate the influence of socio-demographic, family, and school-related factors on academic success.

Diabetes ML

Development of a binary classification model to predict diabetes diagnosis using patient clinical data (Frankfurt Hospital Diabetes dataset). The objective is to evaluate the influence of metabolic, treatment-history, and lifestyle factors on diabetes risk.

About Me


Data Analyst | Machine Learning

Highly skilled and detail-oriented Data Analyst with a proven ability to transform raw data into actionable insights that drive strategic decision-making. Specialized in data exploration, cleaning, and modeling (including predictive ML models) to identify patterns and trends that fuel business growth.

Proficient in Power BI, SQL, Python (Scikit-learn, Pandas), Excel and Tableau, with experience in building dynamic visualizations, developing data pipelines, and generating impactful reports. Strong foundation in ETL processes and data management across on-premise and cloud environments (Azure, AWS, Google Cloud), including deployment of ML models.

Passionate about automation and about optimizing workflows through scripting and machine learning solutions such as diabetes risk prediction (82% accuracy with XGBoost). Adept at communicating complex findings clearly and effectively, enabling stakeholders at all levels to make informed decisions.

Continuously expanding knowledge in machine learning (binary classification, feature engineering), artificial intelligence, and big data analytics, leveraging data as a strategic asset to solve complex problems and drive innovation.


My Technical Skills

Programming Languages

Python: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn

SQL: MySQL, PostgreSQL, Query Optimization

Others: JavaScript, R (Basic)

Data Analysis

Techniques: Exploratory Data Analysis (EDA), Data Wrangling

Processes: Data Cleaning, Transformation, Feature Engineering

Tools: Jupyter Notebooks, Google Colab

Data Visualization

Power BI: DAX, Power Query, Data Modeling

Tableau: Dashboards, Storytelling

Others: Excel (Pivot Tables, Advanced Charts), Matplotlib/Seaborn

Databases

Relational: PostgreSQL, MySQL, SQL Server

NoSQL: MongoDB, Firebase

ETL: Data Pipelines, Apache Airflow

Machine Learning

Libraries: Scikit-learn, TensorFlow, Keras, PyTorch

Techniques: Regression, Classification, Clustering, NLP

Model Evaluation: Cross-validation, Hyperparameter Tuning

DevOps & Methodologies

Version Control: Git, GitHub, GitLab, Bitbucket

Cloud Services: AWS (S3, EC2, Lambda), Google Cloud, Azure

Workflow: Agile, Scrum, Kanban, CI/CD Pipelines

My Academic Background

Data Science Bootcamp

The Bridge - 600h (2024-2025)

Higher Degree in Web Application Development

Jesuïtes FP (In progress)

Introduction to Python

CIFO La Violeta - 100h (2024)

Master's in Education and ICT

Universitat Oberta de Catalunya (2023)

Degree in English Studies

Universitat de Barcelona (2016-2021)

My Certificates

Python Essentials 1

Cisco


Python Essentials 2

Cisco


IBM Data Analyst

IBM


Data Science

The Bridge
