Fourth One-Week National Workshop on Statistical Techniques in Biological, Computational and Medical Sciences
Centre of Healthcare Technologies and Informatics (CEHTI), Department of Biotechnology and Bioinformatics & Department of Mathematics, JUIT
TOPICS TO BE COVERED
Statistics is all about converting data into useful information. Broadly, this covers:
- Collecting data,
- Summarizing and organizing data, and
- Interpreting data to generate meaningful information.
Biostatistics is the application of statistics to a variety of topics in biology. This workshop focuses on statistical applications in Biotechnology, Bioinformatics, Computer Science, Engineering and Health Sciences. It introduces a number of basic concepts and techniques that should enable participants to get started with practical statistics using MS Excel, R and SPSS for statistical computing. Advanced statistical analyses and techniques with real-time datasets will also be covered. There will be rigorous hands-on sessions every day to apply the theory and implement the concepts in practice.
- Course Content (Theory):
- Need for Biostatistics: Building up the background, why statistics in biosciences?
- Descriptive Statistics: Measurements, Data types, Centrality, dispersion, Skewness, Kurtosis, Bivariate data structure.
- Probability and Distribution: Random sampling, probability calculation, discrete distributions, continuous distributions.
- Testing Hypothesis: Population, sample, parameter, statistic, sampling techniques, common statistical methods, sample size calculation.
- Parametric tests: Independent t-test, paired t-test.
- Non-parametric tests: Mann-Whitney U test (independent observations), Wilcoxon signed-rank test (paired observations), and calculating the correlation coefficient.
- Analysis of Variance: One-way ANOVA, Two-way ANOVA with repeated trials, Three-way ANOVA with interaction.
- Design of Experiments: Principles of design, CRD, RBD, LSD, factorial designs 2^2, 2^3, 3^2, 3^3, complete and partial confounding.
- Multiple comparison procedures and drug analysis: Pairwise comparisons, successive comparisons, comparisons with control, comparisons with best, order-restricted comparisons, dose-response experiments to detect the minimum effective dose.
- Cluster analysis: Clustering Techniques and their applications
- Factor analysis and Principal component analysis (PCA): Reduce Number of Variables and Detect Relationships, Classification Techniques and their applications
- Correlation and Regression Analysis:
  - Linear and Non-linear Regression: Simple linear regression and its interpretation, model-building strategies and diagnostics, non-linear regression reducible to linear form (exponential, logarithmic and power regression analysis).
  - Multiple Regression Analysis: Linear and non-linear regression using matrix theory.
  - Testing in Regression: Coefficient of determination, ANOVA in regression.
  - Logistic Regression: Performing logistic regression, interpretation of the output, model-building methods and diagnostics.
  - Multicollinearity: Variance inflation factor (VIF), handling small-n, large-p problems.
- Bioinformatics, Biomedical Informatics and Healthcare Technologies: Basic understanding of all these domains and their applications in disease management.
- Machine Learning and Deep Learning: Supervised and unsupervised learning techniques, understanding ML and DL techniques and their implementation in AI-based protocols and systems.
- Course Content (Labs and Hands-on Sessions):
- Introduction and Data manipulation using R:
  - Installation: Libraries & packages, online help.
  - Using R: Basic arithmetic operations, creating variables, value labeling, functions, vectors, matrices & arrays, factors, data frames, indexing, conditional selection, subset & transform, sorting, importing data, exporting, data entry in R, looping, merging and reshaping.
  - R Graphs: Histogram, vertical bar chart, pie chart, box plot, scatter plot, combining charts.
  - Built-in distributions and packages in R: Normal distribution, Chi-square, F, performing ANOVA, regression analysis, variance inflation factor (VIF), cluster analysis, factor analysis, principal component analysis (PCA).
- Statistical analysis using MS Excel.
- Statistical analysis using SPSS.