© AArete 2025
Context
- Previously, the task required manually reviewing each incoming data field to determine its equivalent in the internal schema, which was time-consuming, error-prone, and hard to scale
- Goal: Create a program that intelligently analyzes client data and matches it to the company's standard data architecture fields, to reduce onboarding time, improve consistency, and enable faster client integration
Final Code Strategies
- Priority First Approach
  - Makes sure fields that are essential to analysis are mapped intelligently
  - Logic is implemented to prioritize these fields throughout the whole process
- Hierarchical Decision Making
  - Exact historical matches
  - Pattern-based rules
  - Fuzzy historical matching
  - General similarity
  - OTHER
- Domain Knowledge Integration
  - Medical abbreviation dictionary
  - Specific mapping rules
- Balancing Competing Goals
  - Uniqueness
  - Exceptions
  - Quality
  - Priority
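The strategies above can be illustrated with a minimal Python sketch. It is not the project's actual code: the abbreviation dictionary, pattern rules, historical mappings, standard field names, and both thresholds are hypothetical placeholders, and `difflib.SequenceMatcher` stands in for whatever similarity measure the real program uses. The sketch also assumes the priority file lists incoming column names, which are mapped first so they get first pick of the standard fields.

```python
import re
from difflib import SequenceMatcher

# All sample data below is hypothetical; the real dictionary, rules,
# history, and standard fields are not given in the source.
ABBREVIATIONS = {"mbr": "member", "dx": "diagnosis", "dob": "date of birth"}
PATTERN_RULES = [(re.compile(r"\bnpi\b"), "provider_npi"),
                 (re.compile(r"\bzip\b"), "zip_code")]
HISTORY = {"member id": "member_id", "diagnosis code": "diagnosis_code"}
STANDARD_FIELDS = ["member_id", "diagnosis_code", "date_of_birth",
                   "provider_npi", "zip_code", "claim_amount"]
FUZZY_HISTORY_MIN = 0.8  # assumed threshold
SIMILARITY_MIN = 0.7     # assumed threshold


def normalize(name):
    """Lowercase, split on spaces/underscores/hyphens, expand abbreviations."""
    tokens = re.split(r"[\s_\-]+", name.strip().lower())
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens if t)


def best_match(query, candidates, threshold):
    """Return the candidate most similar to query, or None if below threshold."""
    scored = [(SequenceMatcher(None, query, c).ratio(), c) for c in candidates]
    score, candidate = max(scored, default=(0.0, None))
    return candidate if score >= threshold else None


def map_field(name, taken):
    """Run the tiers in order; skip targets already taken (uniqueness)."""
    key = normalize(name)
    # Tier 1: exact historical match
    target = HISTORY.get(key)
    if target and target not in taken:
        return target, "exact_history"
    # Tier 2: pattern-based rules
    for pattern, target in PATTERN_RULES:
        if pattern.search(key) and target not in taken:
            return target, "pattern_rule"
    # Tier 3: fuzzy match against historical source names
    free_history = [k for k, v in HISTORY.items() if v not in taken]
    hist_key = best_match(key, free_history, FUZZY_HISTORY_MIN)
    if hist_key:
        return HISTORY[hist_key], "fuzzy_history"
    # Tier 4: general similarity to the standard field names
    labels = {f.replace("_", " "): f for f in STANDARD_FIELDS if f not in taken}
    label = best_match(key, labels, SIMILARITY_MIN)
    if label:
        return labels[label], "similarity"
    # Tier 5: fallback
    return "OTHER", "other"


def map_fields(names, priority=()):
    """Map priority columns first, then the rest; each target is used once."""
    priority_set = set(priority)
    ordered = sorted(names, key=lambda n: n not in priority_set)  # stable sort
    taken, result = set(), {}
    for name in ordered:
        target, tier = map_field(name, taken)
        if target != "OTHER":
            taken.add(target)
        result[name] = (target, tier)
    return {n: result[n] for n in names}  # restore input order


if __name__ == "__main__":
    columns = ["Member Ident", "Mbr_ID", "DX Code", "dob", "Rendering NPI"]
    for column, (target, tier) in map_fields(columns, priority=["Mbr_ID"]).items():
        print(f"{column:15} -> {target:15} ({tier})")
```

In this sketch, `Mbr_ID` is a priority column, so it claims `member_id` before `Member Ident` is considered; `Member Ident` then falls through to `OTHER` because no free standard field is similar enough. Returning the tier name alongside the target is one way to let a reviewer judge the quality of each mapping.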
Process

Final Product
- A Streamlit app in which the user uploads the test data and a priority file and receives a result file in CSV or XLSX format
