<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ekemini</title>
    <description>The latest articles on Forem by Ekemini (@mbabah).</description>
    <link>https://forem.com/mbabah</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1423387%2F856ad6e9-4e74-4062-af44-71ace365a9ef.png</url>
      <title>Forem: Ekemini</title>
      <link>https://forem.com/mbabah</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mbabah"/>
    <language>en</language>
    <item>
      <title>The Role of AI in Precision Medicine &amp; Visual Cell Phenotyping</title>
      <dc:creator>Ekemini</dc:creator>
      <pubDate>Tue, 11 Feb 2025 08:58:58 +0000</pubDate>
      <link>https://forem.com/mbabah/the-role-of-ai-in-precision-medicine-visual-cell-phenotyping-3b82</link>
      <guid>https://forem.com/mbabah/the-role-of-ai-in-precision-medicine-visual-cell-phenotyping-3b82</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;br&gt;
The future of medicine is shifting from a one-size-fits-all approach to precision medicine, where treatments are customized based on a person’s genetic makeup, lifestyle, and environment. With the help of AI, deep learning, and bioinformatics, researchers can now analyze vast amounts of biological and imaging data to develop more effective, personalized treatments.&lt;/p&gt;

&lt;p&gt;One of the key enablers of precision medicine is visual cell phenotyping—the process of analyzing microscopy images of cells to detect patterns that indicate disease, treatment response, or cellular function. This technique, powered by deep learning, is unlocking new ways to diagnose diseases, discover drugs, and predict patient outcomes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How AI &amp;amp; Visual Cell Phenotyping Are Advancing Precision Medicine&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;1️⃣ &lt;em&gt;Early Disease Detection Through Cellular Imaging&lt;/em&gt;&lt;br&gt;
High-resolution microscopy, combined with AI, allows researchers to identify subtle changes in cell structure and behavior.&lt;br&gt;
Deep learning models can recognize phenotypic variations in cancer cells, neurological disorders, and infectious diseases at an early stage.&lt;/p&gt;

&lt;p&gt;2️⃣ &lt;em&gt;AI for Biomarker Discovery &amp;amp; Personalized Treatment&lt;/em&gt;&lt;br&gt;
Biomarkers are biological indicators of disease or treatment response.&lt;br&gt;
Deep learning helps analyze cellular images alongside genomic and transcriptomic data to discover new biomarkers for precision medicine.&lt;/p&gt;

&lt;p&gt;3️⃣ &lt;em&gt;Predicting Drug Responses Using AI-Powered Imaging&lt;/em&gt;&lt;br&gt;
Instead of relying on trial-and-error methods, AI can predict how different patients will respond to specific drugs based on cellular imaging data.&lt;br&gt;
This improves drug efficacy, reduces side effects, and helps create personalized treatment plans.&lt;/p&gt;

&lt;p&gt;4️⃣ &lt;em&gt;Integrating Genomics &amp;amp; Imaging for a Holistic View&lt;/em&gt;&lt;br&gt;
Genomic sequencing tells us about a person’s genetic risks, while cell imaging provides insights into how diseases manifest at the cellular level.&lt;br&gt;
By combining both, AI can help stratify patients, leading to more accurate diagnoses and better treatment strategies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why This Matters for Data Science&lt;/strong&gt;&lt;br&gt;
For data scientists interested in biomedicine and healthcare, this is an exciting area to explore:&lt;br&gt;
✔ Applying computer vision to analyze cellular imaging data.&lt;br&gt;
✔ Building predictive models for disease classification and drug response.&lt;br&gt;
✔ Using deep learning to extract meaningful insights from complex medical datasets.&lt;br&gt;
✔ Contributing to healthcare innovations that improve patient outcomes.&lt;/p&gt;
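&lt;p&gt;As a minimal sketch of the first bullet, such a pipeline can be mocked end to end with scikit-learn on synthetic data; every name and number below is illustrative, not taken from any real study.&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for microscopy data: 200 tiny 8x8 "cell images",
# where class-1 images get a brighter center patch (a toy phenotype).
rng = np.random.default_rng(0)
images = rng.normal(0.0, 1.0, size=(200, 8, 8))
labels = rng.integers(0, 2, size=200)
images[labels == 1, 3:5, 3:5] += 2.0  # inject the phenotype signal

# Flatten each image into a feature vector and fit a simple classifier.
X = images.reshape(200, -1)
clf = LogisticRegression(max_iter=1000).fit(X, labels)
accuracy = clf.score(X, labels)
```

&lt;p&gt;Real phenotyping work swaps the linear model for a convolutional network and the synthetic arrays for stained microscopy images, but the shape of the problem is the same: images in, phenotype labels out.&lt;/p&gt;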

&lt;p&gt;&lt;strong&gt;Thoughts&lt;/strong&gt;&lt;br&gt;
AI and deep learning are making precision medicine a reality by unlocking insights hidden in medical data. With visual cell phenotyping, we are now able to analyze diseases at the cellular level, leading to breakthroughs in diagnostics, drug discovery, and personalized treatments.&lt;/p&gt;

&lt;p&gt;The intersection of data science, AI, and biology is opening new frontiers in healthcare.&lt;/p&gt;

&lt;p&gt;#DataOcean #DataScience #DeepLearning #AIinBiology #Bioinformatics #ComputerVision&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Exploring Multicollinearity: Strategies for Detecting and Managing Correlated Predictors in Regression Analysis</title>
      <dc:creator>Ekemini</dc:creator>
      <pubDate>Sun, 14 Apr 2024 17:13:14 +0000</pubDate>
      <link>https://forem.com/mbabah/exploring-multicollinearity-strategies-for-detecting-and-managing-correlated-predictors-in-regression-analysis-1ln</link>
      <guid>https://forem.com/mbabah/exploring-multicollinearity-strategies-for-detecting-and-managing-correlated-predictors-in-regression-analysis-1ln</guid>
      <description>&lt;p&gt;Multicollinearity is a statistical phenomenon that occurs when two or more independent variables in a regression model are highly correlated with each other. In other words, multicollinearity indicates a strong linear relationship among the predictor variables.This can make it difficult to interpret the individual effects of each predictor on the dependent variable because their effects may be confounded or exaggerated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reasons for Test of Multicollinearity&lt;/strong&gt;&lt;br&gt;
The primary reasons for conducting tests of multicollinearity in regression analysis are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Impact on Model Interpretation&lt;/li&gt;
&lt;li&gt;Inflated Standard Errors&lt;/li&gt;
&lt;li&gt;Unstable Estimates&lt;/li&gt;
&lt;li&gt;Reduced Model Performance&lt;/li&gt;
&lt;li&gt;Difficulty in Variable Selection&lt;/li&gt;
&lt;li&gt;Violation of Assumptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Checking for multicollinearity is crucial for building reliable regression models that accurately capture the relationships between variables and provide meaningful insights for decision-making.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dataset&lt;/strong&gt;&lt;br&gt;
After completing the data cleaning process, here are the first five rows of our dataset:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhaw2lxf315iqj8foxlhg.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhaw2lxf315iqj8foxlhg.PNG" alt="Dataset.head"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Imported Packages&lt;/strong&gt;&lt;br&gt;
To conduct the multicollinearity test, the following libraries were imported to support data analysis and statistical computation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyin5fxbp3sc6sweka1g2.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyin5fxbp3sc6sweka1g2.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Feature Engineering&lt;/strong&gt;&lt;br&gt;
Multicollinearity detection methods rely on numerical data: they compute correlation coefficients or variance inflation factors (VIFs) between predictor variables, both of which require numerical inputs. If categorical variables are not encoded, multicollinearity checks cannot be performed accurately.&lt;br&gt;
In our dataset, the location column is categorical, with 849 unique values:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr0p2g4aohjitb6d71q8c.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr0p2g4aohjitb6d71q8c.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For this reason, we encode the categorical column in our dataset using the frequency encoding method.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2v3sfra1tu7s8g85w8eu.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2v3sfra1tu7s8g85w8eu.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
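&lt;p&gt;Frequency encoding replaces each category with its relative frequency in the column. A pandas sketch on a hypothetical location column (the city names are made up):&lt;/p&gt;

```python
import pandas as pd

# Hypothetical sample of the categorical location column.
df = pd.DataFrame({"location": ["Lekki", "Yaba", "Lekki", "Ikeja", "Lekki"]})

# Map each category to its share of the rows (its frequency).
freq = df["location"].value_counts(normalize=True)
df["location_encoded"] = df["location"].map(freq)
```

&lt;p&gt;Unlike one-hot encoding, this keeps a single numeric column even with 849 unique values, which matters when the encoded column feeds into correlation and VIF calculations.&lt;/p&gt;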

&lt;p&gt;&lt;strong&gt;Correlation Analysis&lt;/strong&gt;&lt;br&gt;
To work only with the predictor variables, we drop the target vector.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frz5sek7dm6tbmddybvpw.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frz5sek7dm6tbmddybvpw.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Correlation measures the strength and direction of the linear relationship between two variables, helping to identify multicollinearity issues and select the most relevant predictors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh8mp5h4f40z5f0mjaoj8.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh8mp5h4f40z5f0mjaoj8.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
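&lt;p&gt;With every predictor numeric, the pairwise correlation matrix shown above is a one-liner in pandas. The columns below are synthetic, with rooms deliberately built from area so the two correlate:&lt;/p&gt;

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
area = rng.normal(120.0, 30.0, size=100)
X = pd.DataFrame({
    "area": area,
    "rooms": 0.05 * area + rng.normal(0.0, 0.5, size=100),  # correlated with area
    "age": rng.normal(15.0, 5.0, size=100),                 # independent
})

corr = X.corr()  # Pearson correlation for every pair of predictors
```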

&lt;p&gt;&lt;strong&gt;Assessing Multicollinearity with Heatmap Visualization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using a heatmap is an effective visual tool to assess multicollinearity by displaying correlation coefficients between variables.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foprupomd8hm8scbp72qd.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foprupomd8hm8scbp72qd.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
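&lt;p&gt;The heatmap itself is conventionally drawn with seaborn (assumed here; the article's own plotting code is visible only as an image):&lt;/p&gt;

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["area", "rooms", "age"])

# Annotated correlation heatmap on a fixed [-1, 1] colour scale.
ax = sns.heatmap(X.corr(), annot=True, cmap="coolwarm", vmin=-1, vmax=1)
ax.figure.savefig("corr_heatmap.png")
```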

&lt;p&gt;&lt;strong&gt;Checking Multicollinearity with an OLS Model Summary&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To check whether our multicollinearity assessment is adequate for model building, we examine the ordinary least squares (OLS) regression summary.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2uc4zj5mcefnqnceg1m.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2uc4zj5mcefnqnceg1m.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model Summary&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbb07wu93067ww3vgc6y.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbb07wu93067ww3vgc6y.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The summary shows that the condition number is large, implying that there might be strong multicollinearity or other numerical problems.&lt;/p&gt;

&lt;p&gt;Therefore, to confirm whether the large condition number results from multicollinearity, we apply the variance inflation factor (VIF) method.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd2qwkag5fpv9e9ae381z.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd2qwkag5fpv9e9ae381z.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Variance Inflation Factor Result&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgpz2qgi0vogz8n1aylzz.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgpz2qgi0vogz8n1aylzz.PNG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VIF Decision Key&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VIF &amp;lt; 2: Minimal multicollinearity; no action needed.&lt;/li&gt;
&lt;li&gt;2 ≤ VIF &amp;lt; 5: Moderate multicollinearity; consider further investigation or data transformation.&lt;/li&gt;
&lt;li&gt;5 ≤ VIF &amp;lt; 10: High multicollinearity; problematic, requires attention (e.g., variable selection, data transformation).&lt;/li&gt;
&lt;li&gt;VIF ≥ 10: Severe multicollinearity; critical issue, immediate action needed (e.g., variable removal, data restructuring).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The VIF results suggest that there is no multicollinearity issue among the predictor variables in our regression model, since all VIF values fall below the thresholds above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test Observation&lt;/strong&gt;&lt;br&gt;
From our analysis, we noticed a large condition number of 1.21e+05, indicating the potential presence of strong multicollinearity or other numerical problems within our regression model. To confirm this, we conducted a Variance Inflation Factor (VIF) analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt; &lt;br&gt;
Based on the VIF analysis, we can conclude that multicollinearity is not a significant issue in our regression model. The large condition number observed is likely due to numerical factors other than multicollinearity. Therefore, we can proceed with confidence in the validity of our regression analysis results.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>python</category>
      <category>data</category>
    </item>
  </channel>
</rss>
