David Adrián Cañones Castellano
Contact details
- david.canones.castellano (at) gmail.com
- david.canones (at) whiteboxml.com
Summary
Data Science Leader with a passion for AI and Machine Learning. I've been thriving in the field since 2015, before it was a trending topic.
My journey started at Servinform, crafting automation solutions that saved significant resources. I further honed my skills at Kernel Analytics and Pragsis Bidoop, focusing on predictive analytics and energy prediction models.
In 2019, I co-founded WhiteBox, aiming to deliver top-tier Data & Analytics projects in Spain, the UK, and the US. We've grown significantly, maintaining high-quality standards and attracting top talent. In 2021, we branched out to become one of the first DaaS companies in Spain, launching DataMarket, a platform providing quality datasets for our clients.
My goal is to continually innovate, add value through data, and promote technical excellence within my team.
Selected Experience
Lead Data Scientist & Partner, WhiteBox (10/2019 - Present):
As the Lead Data Scientist and Partner at WhiteBox, my role is multifaceted, spanning technical leadership, project management, and client relations. I lead an accomplished team of data scientists, driving innovative solutions and supervising the execution of key projects.
Responsibilities:
- Leading the Data Science team, mentoring and overseeing their work, providing technical guidance and advice.
- Developing complex components of our projects and boosting the productivity of my team members.
- Collaborating closely with clients to understand their needs and identifying impactful data use cases.
Achievements:
- Finalist in the international edition of the AWS DeepRacer in Las Vegas, 2019.
- Led the development of the Bricomart demand forecasting model in 2020, optimizing inventory for over 900 product references (SKUs) across 30 stores.
- Developed the Vodafone causal inference framework in 2021, enabling the establishment of cause-effect relationships in churn, network quality, and customer service.
Chief Data Officer (CDO) & Co-founder, DataMarket (1/2021 - Present):
Pioneering the DaaS (Data as a Service) landscape in Spain, providing companies with high-quality, diverse datasets to drive their business decision-making. We are committed to continuously enriching our data catalogue to cater to the evolving needs of our diverse clientele.
Responsibilities:
- Steering the company's strategic vision and fostering a data-centric culture.
- Overseeing data sourcing, curation, and distribution, ensuring high quality and relevance for our customers.
- Building and maintaining client relationships, understanding their unique data requirements, and delivering tailored data solutions.
Achievements:
- Proud to serve a growing roster of clients including Burger King, Adevinta, and Iryo, among others.
- Developed a robust platform offering instant access to a wide range of datasets, helping businesses complement their internal data and gain comprehensive insights.
- Positioned DataMarket as one of the first and foremost DaaS providers in Spain, playing a pivotal role in the country's data revolution.
Lead Teacher of Data & Analytics, Ironhack (7/2019 - 9/2021):
At Ironhack, I embraced the opportunity to combine my passion for Data Science with my enthusiasm for teaching and mentorship. I led data analytics classes for aspiring data scientists, contributing to the evolution of the tech industry in Spain by nurturing the next generation of data professionals.
Responsibilities:
- Crafting a comprehensive, cutting-edge curriculum for the Data Analytics course, focused on the practical application of Machine Learning, Data Visualization, and Data Engineering concepts.
- Guiding students through complex data projects, promoting a hands-on learning approach and fostering critical thinking.
- Providing personalized mentorship to students, helping them overcome technical challenges and guiding them in their career development.
Achievements:
- Mentored more than 100 students over 2 years, many of whom secured roles at top tech companies after graduation.
- Developed a unique teaching methodology that simplified complex data concepts, consistently earning a 95% satisfaction rating in student feedback.
- Actively participated in local Data Science meetups and events, representing Ironhack and inspiring a broader audience to explore the world of data analytics.
Senior Data Scientist, Pragsis Bidoop (7/2018 - 10/2019):
My tenure at Pragsis Bidoop marked a pivotal point in my career, where I deepened my expertise in traditional Machine Learning and Deep Learning frameworks, and honed my skills in MLOps to scale Machine Learning models from prototypes to production.
Responsibilities:
- Building robust Machine Learning models leveraging traditional (scikit-learn, XGBoost, LightGBM) and Deep Learning (TensorFlow, Keras) frameworks.
- Scaling Machine Learning models from concept to deployment using distributed and parallel computing (Spark, Dask, Celery) and orchestration tools like Apache Airflow.
- Developing cutting-edge Computer Vision solutions using Google Edge TPUs, Nvidia GPUs, TensorFlow, and OpenCV.
Achievements:
- Enhanced power production forecasting for a 1 GW cluster of 13 wind farms in Washington, USA. This led to a 10% reduction in forecasting error, resulting in substantial cost savings for our client.
- Contributed to the winning team in the Amazon Web Services DeepRacer League by developing a Reinforcement Learning algorithm.
- Created a real-time person detection and tracking system using Computer Vision techniques optimized for low-power hardware (Google Edge TPU, Raspberry Pi).
Data Scientist, Kernel Analytics (10/2017 - 7/2018):
During my tenure at Kernel Analytics, I was deeply involved in harnessing the power of data to drive valuable insights and predictions for our clients, with a focus on machine learning and big data frameworks. My role centered around KPI design, data pipeline creation, data visualization, and predictive modeling.
Responsibilities:
- Leveraging traditional (scikit-learn) and Big Data (Spark MLlib) frameworks to develop machine learning models.
- Developing interactive dashboards with Plotly Dash and Microsoft Power BI for effective data exploration and visualization.
Achievements:
- Designed and developed a Customer Experience Management (CEM) framework for a leading mobile operator. The pipeline ingested 3G/4G antenna data and built a model relating customer experience to churn and complaints (contact center calls), giving the client actionable insights into the impact of its mobile network infrastructure on customer experience.
- Developed a model for a renowned mobile operator that predicted user complaints (contact center calls) from consumption patterns and users' personal profiles. This led to the automation of part of the client's support process.
Data Scientist, Grupo Servinform (2/2015 - 10/2017):
During my tenure at Grupo Servinform, I leveraged advanced data science techniques and automation strategies to streamline processes and deliver impactful solutions for our clients. My work revolved around data pipeline design, natural language processing (NLP), and the development of a data product for healthcare research.
Responsibilities:
- Designing and implementing automated data pipelines capable of cleaning and ingesting data from diverse inputs into relational databases.
- Developing an innovative data product to simplify data exploration for pharmaceutical and healthcare researchers, enabling them to identify complex relationships between different data sources.
Achievements:
- Successfully developed an NLP model that could interpret user queries written in natural language and translate them into database queries (SQL).
- Conceptualized and developed an automation framework that integrated seamlessly with existing back-office processes, paving the way for a new business line (RPA) and the ability to automate previously unfeasible processes.
Education
EOI Business School (9/2014 - 9/2015):
MBA, Corporate Finance
Universidad de Sevilla (9/2007 - 9/2014):
MSc Industrial Engineering, Energy
Honors & Awards
AWS DeepRacer League Madrid, 3rd position (5/2019)
Machine Learning competition organized by AWS that consisted of developing a Reinforcement Learning model for an autonomous car. I took 3rd place in the Spanish competition and was a member of the team that took the top 3 positions (Gold, Silver and Bronze).
Courses & Certifications
- TensorFlow Developer Certificate, Google
- Deep Learning Specialization, DeepLearning.AI
- TensorFlow in Practice Specialization, DeepLearning.AI
- AI for Medicine Specialization, DeepLearning.AI
- Machine Learning, Stanford University
- Advanced Machine Learning Specialization, Higher School of Economics
- Learning From Data, Caltech
- The Analytics Edge, MIT
- Computational Thinking and Data Science, MIT
- Computer Science and Programming Using Python, MIT
- Data Manipulation at Scale: Systems and Algorithms, University of Washington
Technical Stack
Section | Technologies |
---|---|
Machine Learning | scikit-learn, XGBoost, LightGBM, CatBoost, H2O, MLlib, MLflow |
Deep Learning | TensorFlow, PyTorch |
Generative AI | LLMs, Diffusion Models, HuggingFace Transformers |
Computer Vision | OpenCV |
NLP | NLTK, spaCy, fastText |
In-Memory Data Processing | pandas, polars, NumPy, Apache Arrow |
Relational Databases | PostgreSQL, SQLite, Oracle |
Big Data | Apache Spark, Hadoop, ClickHouse |
Cloud Computing | AWS, GCP, Azure |
Orchestration | Apache Airflow |
Languages | Python, R, SQL |
Visualization | Matplotlib, seaborn, Plotly, Power BI |