Solina Kim


  • Hi!
  • I'm Solina Kim, a computer science major at the University of Notre Dame.
  • As a rising junior dedicated to working as an engineer in the healthcare industry, I spent the past summer at Indiana University Health as an ML Engineering intern evaluating Text Analytics for Health, a Named Entity Recognition model for healthcare data.
  • I also have experience in other projects at the intersection of healthcare and computer science, such as transformer model development for SARS-CoV-2 inhibitors, and bioinformatics for malaria in Sub-Saharan Africa.
  • In my free time, I like to train models, build programs, play Tetris, work out, travel, and more!

Tools and Skills

Neural Networks

Natural Language Processing

TensorFlow

C Language

C++

Python

R

SQL

Scala

ArcGIS

SeqMan

JavaScript

ReactJS

HTML/CSS

Communication

Leadership

Creative Thinking

Work Ethic

Experience

Indiana University Health

ML Engineering Intern

May - August 2022

  • Conducted error analysis on output from Microsoft’s Named Entity Recognition model on IU Health’s doctors’ notes, using Python Pandas and Scala in a Databricks environment.
  • Discovered issues in case-sensitivity and lack of knowledge transfer between model components.
  • Delivered novel insights on ways to collaborate with Microsoft to improve model performance to fit IU Health's data.
  • Proactively communicated and balanced the diverse needs of stakeholders at IU Health, Microsoft Cognitive Services, and AnalytiXIN.

Lucy Family Institute for Data and Society

Research Assistant

August 2021 - May 2022

  • Generative AI Design and Exploration of Nucleoside Analogs
  • Contributed to developing the Conditional Random Transformer (CRT) model, an ML-based algorithm that efficiently searches chemical space to generate a limited quantity of molecules qualitatively similar to SARS-CoV-2 inhibitors, using Python with RDKit, tqdm, Pandas, and NumPy.
  • Analyzed Tanimoto similarity, Morgan fingerprints, pairwise similarity, and validity of molecules generated by the CRT model using Python with RDKit and Pandas.
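
As a rough illustration of the similarity metric named above, here is a minimal sketch of Tanimoto similarity computed on fingerprint "on-bit" sets. It uses plain Python sets as a stand-in for the Morgan fingerprints RDKit would produce; the function names and the toy bit sets are hypothetical and are not taken from the project code.

```python
from itertools import combinations


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity: |A & B| / |A | B| over sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Assumption for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / len(union)


def mean_pairwise_similarity(fingerprints):
    """Average Tanimoto similarity over all unordered pairs of fingerprints."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy on-bit sets standing in for real molecular fingerprints.
generated = [{1, 2, 3, 4}, {3, 4, 5}, {7, 8}]
print(tanimoto(generated[0], generated[1]))
print(mean_pairwise_similarity(generated))
```

A low mean pairwise similarity within a generated set suggests diverse molecules, while a high similarity to known inhibitors suggests the generator stays close to its targets.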

World Health Organization & University of Notre Dame

Research Assistant

March 2021 - May 2022

  • ITS2 and CO1 Gene Sequence Analysis
  • Extracted and parsed all available CO1 and ITS2 sequences of sub-Saharan African Anopheles from NCBI and BOLD Systems using R. The script also detected genetic anomalies due to unknown species or human error in data submission.
  • Further analyzed anomalous genes using SeqMan and flagged potential novel species for physicochemical analysis on-site in Africa.

University of Notre Dame

Research Assistant @ Lobo Lab

March 2021 - May 2022

  • Accessibility to Malaria Treatment in Zambia
  • Processed GPS tracker data into shapefiles and generated a map of roads to malaria treatment facilities using R.
  • Compared the GPS tracker-generated map to satellite images of roads in ArcGIS to assess the realistic quality of access to malaria treatment in Zambia.

Korea Centers for Disease Control and Prevention

Research Intern @ Insect-Borne Diseases Department

July 2019 - August 2019

  • Collected mosquito samples, manually identified species, and conducted PCR for the government’s insect-borne diseases surveillance system.
  • Designed and conducted experiments to test mosquito repellent products for client companies.

Projects

Credit Card Scam Detector

  • Developed a neural network, an XGBoost decision tree, and a logistic regression model for credit card scam detection.
  • Improved F1 scores by an average of 60% for all models to reach 0.85 to 0.90 on the test set.
  • Implemented functions and pipelines to replicate and run experiments reliably and efficiently.
  • Utilized GridSearchCV for hyperparameter tuning, and explored random undersampling (RUS), random oversampling (ROS), SMOTE, and normalization to improve model performance on an imbalanced dataset.

Machine Learning

Neural Networks

TensorFlow

Scikit-learn

ML Diagnostics

Preference Matching Algorithm

  • Developed an expanded version of the Gale-Shapley algorithm to fit unequal sets and incomplete, noncardinal preferences.
  • Implemented the algorithm as a memory- and runtime-efficient program using deques and sets.
  • Embedded the program into an automated email UI with Python Pandas, imaplib, and NumPy for professors to utilize easily.
  • Identified deliverables, project dependencies, and design considerations, and distributed responsibility among team members for a two-month project.
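
For readers unfamiliar with the approach, here is a minimal sketch of deferred acceptance (Gale-Shapley) adapted to sides of unequal size and incomplete ordinal preference lists, using a deque for the queue of free proposers and a set for those left unmatched. It is an illustrative assumption about how such an extension can look, not the project's actual implementation; the function name `match` and the sample data are hypothetical.

```python
from collections import deque


def match(proposer_prefs, reviewer_prefs):
    """Deferred acceptance with unequal sides and incomplete preference lists.

    Each argument maps a name to an ordered list of acceptable partners,
    most preferred first. Anyone absent from a list is unacceptable.
    Returns (engaged, unmatched): a reviewer -> proposer dict and the set
    of proposers who ran out of choices.
    """
    # Rank lookup so a reviewer can compare two proposers in O(1).
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in reviewer_prefs.items()
    }
    # Each proposer's remaining choices, consumed from the left.
    remaining = {p: deque(prefs) for p, prefs in proposer_prefs.items()}
    free = deque(proposer_prefs)  # proposers still looking for a match
    engaged = {}                  # reviewer -> proposer
    unmatched = set()             # proposers who exhausted their lists

    while free:
        p = free.popleft()
        if not remaining[p]:
            unmatched.add(p)
            continue
        r = remaining[p].popleft()
        if p not in rank.get(r, {}):
            free.appendleft(p)    # r finds p unacceptable; try p's next choice
            continue
        current = engaged.get(r)
        if current is None:
            engaged[r] = p
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p        # r prefers p; the displaced proposer is free again
            free.append(current)
        else:
            free.append(p)        # r keeps its current partner
    return engaged, unmatched


students = {"a": ["X", "Y"], "b": ["X"], "c": ["Y", "X"]}
projects = {"X": ["b", "a"], "Y": ["a", "c"]}
print(match(students, projects))
```

In this toy run, three students compete for two projects, so one student necessarily ends up in the unmatched set.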

Data Structures

Algorithms

imaplib

Pandas

Teamwork

Kiwoom Securities API Tradebot

  • Developed a program to request, collect, and pipe data from the API of Kiwoom Securities (South Korea’s #1 stock brokerage service).
  • Designed an Excel file to receive data from the above program and output visualizations of metrics as requested by the client.
  • Reduced the client’s time spent on data parsing, processing, and analysis by 70% while maintaining his 40% average annual profit rate.

Python

Kiwoom API

PyQt5

Client Service

Communication

Sociopolitical Factors of Covid-19 Response

  • Research was funded by the University of Notre Dame with a DaVinci Grant of $4,390.
  • Defined a country-specific qualitative index of sociopolitical factors that affect data surveillance measures for pandemic control.
  • Analyzed the relationship between the index and pandemic control effectiveness in China, South Korea, and the USA based on surveys conducted in the three countries.

Political Philosophy

Statistics

R

Problem Solving

Project Management

Contact

I'm interested in working for a spirited company that values endless innovation, service, and work integrity.