Data Scientist (Machine Learning & Pipeline Engineering)

Remote Full-time

About the position

Kalamata Capital Group is a forward-thinking financial technology company committed to leveraging data-driven intelligence to support small business growth. We are seeking a highly skilled Data Scientist to develop predictive models, perform robust exploratory data analysis, and build scalable data pipelines that power key business decisions across the organization.

The ideal candidate is an experienced data scientist with deep technical expertise in machine learning, data engineering workflows, and statistical modeling. This role will work closely with engineering, product, and analytics teams to design, validate, and deploy ML solutions that improve decision-making efficiency. Strong proficiency in Pandas, PySpark, and MongoDB is essential, along with the ability to write clean, reproducible, production-ready code. The successful candidate will be equally comfortable communicating complex analytical insights to non-technical stakeholders.

Responsibilities

• Exploratory Analysis & Data Profiling: Conduct EDA on large, complex datasets using Pandas and PySpark; assess data quality and structure.
• Model Development: Build, tune, and evaluate supervised and unsupervised machine learning models (e.g., tree-based methods, regressions, boosting algorithms).
• Pipeline Engineering: Design and implement reliable, maintainable machine learning pipelines and preprocessing workflows for production environments.
• Data Management: Query and integrate MongoDB datasets; design efficient schemas and aggregation pipelines that support analytical and operational workloads.
• Visualization: Create intuitive visualizations using seaborn, plotly, and matplotlib to support model diagnostics and business storytelling.
• Reproducible Code: Write clean, modular, well-documented Python code (PEP 8 compliant); maintain version control using Git.
• Model Explainability: Apply model interpretation tools such as SHAP and LIME to evaluate feature impact and improve transparency.
• Cross-Functional Collaboration: Partner with engineering, analytics, and product teams to translate business needs into actionable model-driven solutions.
• Documentation: Produce clear technical memos, reports, and model documentation for internal stakeholders.

Requirements

• M.S. in Computer Science, Machine Learning, Computational Biology, or a related quantitative field plus 3+ years of relevant experience, or an equivalent combination of education and applied work.
• Strong foundation in Linear Algebra, Probability, and Statistics.
• Advanced proficiency with Pandas and PySpark for data cleaning, reshaping, merges, feature engineering, and workflow optimization.
• Strong experience with MongoDB, including querying, indexing, and aggregation pipelines.
• Deep knowledge of supervised/unsupervised ML techniques and tools (scikit-learn, XGBoost).
• Solid understanding of optimization, regularization, loss functions, and evaluation metrics (AUC, precision, recall, RMSE).
• Experience delivering end-to-end ML projects (data ingestion → modeling → evaluation → optional deployment).
• Ability to write clean, reproducible code and maintain organized notebooks/scripts.
• Excellent communication skills with the ability to translate analysis into business insights.
• Ability to relocate to the New York metro area.

Nice-to-haves

• Experience with AWS tools (Glue, S3, DMS).
• Familiarity with deep learning frameworks (PyTorch, TensorFlow).
• Experience deploying models using FastAPI, Flask, AWS, or GCP.
• SQL, data warehousing, or data versioning experience.
• Software engineering best practices (testing, CI/CD, code review).
• Link to GitHub, GitLab, or portfolio of analytical/ML code.

Benefits

• Flexible work from home options available.



