From 8c42104bbda3d4c9426ea7283e78b56ca37302cb Mon Sep 17 00:00:00 2001
From: John Peach
Date: Tue, 25 Jun 2019 07:16:35 -0700
Subject: [PATCH] updated 2018 README.md

---
 2018-01-23_cart/README.md | 9 +++++++++
 2018-01-23_decision_trees/README.md | 9 ++++++++-
 2018-02-27_exploring_decision_trees_with_r/README.md | 9 ++++++---
 .../README.md | 9 +++++----
 2018-05-29_DataFest_DF3_Chapman/README.md | 10 ++++++----
 2018-05-29_DataFest_Data_Dirtbags/README.md | 10 ++++++----
 2018-05-29_DataFest_Seems_Logit/README.md | 10 ++++++----
 2018-05-29_DataFest_Team_Epsilon/README.md | 10 ++++++----
 .../README.md | 11 +++++++----
 2018-07-31_city_of_LA-HR_contracts_analysis/README.md | 10 ++++++----
 2018-07-31_pacific_life_mortality_analysis/README.md | 8 +++++---
 2018-08-28_Cancer_Mutations_Scores/README.md | 9 +++++----
 2018-08-28_RStudio_TuneUp/README.md | 9 +++++----
 2018-09-25_Color_talk/README.md | 7 ++++---
 .../README.md | 9 +++++----
 .../README.md | 7 ++++---
 2018-11-27_Connecting_Javascript_to_Shiny/README.md | 9 +++++----
 2018-11-27_Data_Storytelling/README.md | 9 +++++----
 18 files changed, 103 insertions(+), 61 deletions(-)
 create mode 100644 2018-01-23_cart/README.md

diff --git a/2018-01-23_cart/README.md b/2018-01-23_cart/README.md
new file mode 100644
index 0000000..e6b63e1
--- /dev/null
+++ b/2018-01-23_cart/README.md
@@ -0,0 +1,9 @@
+# Categorical and Regression Trees (CART)
+
+* 2018-01-23
+* Speaker: John Peach
+
+## Abstract
+This talk dives deep into the mathematical development of CART trees. It examines the
+assumptions and trade-offs that go into the development of the model and then
+how a tree is determined given a specific dataset.
diff --git a/2018-01-23_decision_trees/README.md b/2018-01-23_decision_trees/README.md
index f7f5578..3eb9125 100644
--- a/2018-01-23_decision_trees/README.md
+++ b/2018-01-23_decision_trees/README.md
@@ -1,3 +1,10 @@
-##Title: Decision Trees with R
+# Using and Improving Decision Trees in R
+
+* 2018-01-23
+* Speaker: Robert Mohr
+
+## Abstract
+This talk shows you how to use the various R packages to build trees and display
+them. It then presents a way to improve the formatting of the tree output.
diff --git a/2018-02-27_exploring_decision_trees_with_r/README.md b/2018-02-27_exploring_decision_trees_with_r/README.md
index 1546988..bdc0efd 100644
--- a/2018-02-27_exploring_decision_trees_with_r/README.md
+++ b/2018-02-27_exploring_decision_trees_with_r/README.md
@@ -1,5 +1,8 @@
-#Speaker: Alfredo F.
+# Exploring Tree Models

-##Title: Exploring Tree Models
+* 2018-02-27
+* Speaker: J. Alfredo Freites

-###Abstract
+## Abstract
+This talk covers how to build and test trees in R. It covers the splitting
+approach and walks through an example.
diff --git a/2018-03-20_machine_learning_with_tree-based_models_in_r/README.md b/2018-03-20_machine_learning_with_tree-based_models_in_r/README.md
index 648cc71..c19e74b 100644
--- a/2018-03-20_machine_learning_with_tree-based_models_in_r/README.md
+++ b/2018-03-20_machine_learning_with_tree-based_models_in_r/README.md
@@ -1,6 +1,7 @@
-#Speaker: Bella Feng
+# Machine Learning with Tree-Based Models in R

-##Title: Machine Learning with Tree-Based Models in R
+* 2018-03-20
+* Speaker: Bella Feng

-###Abstract
-Bella's talk will summarize what she learned from this datacamp class, covering Classification & Regression Trees, Bagged Trees and Random Forests. Code for building, evaluating and tuning model parameters will be shared. If time permits, we will also look into strategy of getting into a kaggle competition with the different tools.
\ No newline at end of file
+## Abstract
+Bella's talk will summarize what she learned from this datacamp class, covering Classification & Regression Trees, Bagged Trees and Random Forests. Code for building, evaluating and tuning model parameters will be shared. If time permits, we will also look into strategy of getting into a kaggle competition with the different tools.
diff --git a/2018-05-29_DataFest_DF3_Chapman/README.md b/2018-05-29_DataFest_DF3_Chapman/README.md
index c7388be..2f147c6 100644
--- a/2018-05-29_DataFest_DF3_Chapman/README.md
+++ b/2018-05-29_DataFest_DF3_Chapman/README.md
@@ -1,7 +1,9 @@
-#Speaker: DF3
+# Chapman University DataFest 2018 - Team DF3

-##Title: DataFest 2018
+* 2018-05-29
+* Speaker: DataFest Team DF3 - Chapman University

-###Abstract
+
+## Abstract
 (from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set."
-https://ww2.amstat.org/education/datafest/
\ No newline at end of file
+https://ww2.amstat.org/education/datafest/
diff --git a/2018-05-29_DataFest_Data_Dirtbags/README.md b/2018-05-29_DataFest_Data_Dirtbags/README.md
index c94ce3a..07aa611 100644
--- a/2018-05-29_DataFest_Data_Dirtbags/README.md
+++ b/2018-05-29_DataFest_Data_Dirtbags/README.md
@@ -1,7 +1,9 @@
-#Speaker: Data Dirtbags
+# Chapman University DataFest 2018 - Team Data Dirtbags

-##Title: DataFest 2018
+* 2018-05-29
+* Speaker: DataFest Team Data Dirtbags

-###Abstract
+
+## Abstract
 (from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set."
-https://ww2.amstat.org/education/datafest/
\ No newline at end of file
+https://ww2.amstat.org/education/datafest/
diff --git a/2018-05-29_DataFest_Seems_Logit/README.md b/2018-05-29_DataFest_Seems_Logit/README.md
index d03042c..6b5341e 100644
--- a/2018-05-29_DataFest_Seems_Logit/README.md
+++ b/2018-05-29_DataFest_Seems_Logit/README.md
@@ -1,7 +1,9 @@
-#Speaker: Seems Logit
+# Chapman University DataFest 2018 - Team Seems Logit

-##Title: DataFest 2018
+* 2018-05-29
+* Speaker: DataFest Team Seems Logit

-###Abstract
+
+## Abstract
 (from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set."
-https://ww2.amstat.org/education/datafest/
\ No newline at end of file
+https://ww2.amstat.org/education/datafest/
diff --git a/2018-05-29_DataFest_Team_Epsilon/README.md b/2018-05-29_DataFest_Team_Epsilon/README.md
index 69d1988..2e8cad8 100644
--- a/2018-05-29_DataFest_Team_Epsilon/README.md
+++ b/2018-05-29_DataFest_Team_Epsilon/README.md
@@ -1,7 +1,9 @@
-#Speaker: Team Epsilon
+# Chapman University DataFest 2018 - Team Epsilon

-##Title: DataFest 2018
+* 2018-05-29
+* Speaker: DataFest Team Epsilon

-###Abstract
+
+## Abstract
 (from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set."
-https://ww2.amstat.org/education/datafest/
\ No newline at end of file
+https://ww2.amstat.org/education/datafest/
diff --git a/2018-05-29_DataFest_Terminal_R_CSU_Fullerton/README.md b/2018-05-29_DataFest_Terminal_R_CSU_Fullerton/README.md
index d996ebd..0ded229 100644
--- a/2018-05-29_DataFest_Terminal_R_CSU_Fullerton/README.md
+++ b/2018-05-29_DataFest_Terminal_R_CSU_Fullerton/README.md
@@ -1,7 +1,10 @@
-#Speaker: The TerminatR
+# Chapman University DataFest 2018 - Team Terminal R
+## Maximizing Job Posting Success

-##Title: Maximizing Job Posting Success
+* 2018-05-29
+* Speaker: DataFest Team Terminal R - CSU Fullerton

-###Abstract
+
+## Abstract
 (from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set."
-https://ww2.amstat.org/education/datafest/
\ No newline at end of file
+https://ww2.amstat.org/education/datafest/
diff --git a/2018-07-31_city_of_LA-HR_contracts_analysis/README.md b/2018-07-31_city_of_LA-HR_contracts_analysis/README.md
index d77e1f3..55b840f 100644
--- a/2018-07-31_city_of_LA-HR_contracts_analysis/README.md
+++ b/2018-07-31_city_of_LA-HR_contracts_analysis/README.md
@@ -1,6 +1,8 @@
-#Speaker: Gloria Gong
+# UCI MSBA Capstone Project
+## IBM Watson - The True Benefits of HR contracts

-##Title: IBM Watson - The True Benefits of HR contracts
+* 2018-07-31
+* Speaker: Gloria Gong

-###Abstract
-Our team project is to help determine the true benefits costs in Human Resource contracts and we were working with IBM and the City of Los Angeles for our capstone. We used IBM Watson software (mainly Knowledge Studio and Natural Language Understanding) to conduct an analysis and our target data was the MOUs published on the City of LA website. Throughout the five-month project, we were given timetables in which we were to essentially learn the City MOUs breakdown and understand the MOUs. From the City’s part, questions/possibilities were brought up as in, what are the common and unique benefits for each MOU? Can Watson help identify more irrelevant information in City contracts, so they can be amended? Can we use the potential model to compare other City employment contracts/MOUs to their own? And lastly, how will the changes of benefit costs bring to social issues such as crime rate? Through creating dictionaries, applying pre-annotators, doing human annotating, training and evaluating the models, we achieved an overall model F1 score of 0.8. For detailed statistics, we got 0.5 in recall for ‘benefits’ and 0.26 for ‘eligibility’. Recalls for other entities are as high as above 0.7. We helped the City of LA in saving contract processing time and the model was able to analyze contract (average 19,000 words) within 10mins.
\ No newline at end of file
+## Abstract
+Our team project is to help determine the true benefits costs in Human Resource contracts and we were working with IBM and the City of Los Angeles for our capstone. We used IBM Watson software (mainly Knowledge Studio and Natural Language Understanding) to conduct an analysis and our target data was the MOUs published on the City of LA website. Throughout the five-month project, we were given timetables in which we were to essentially learn the City MOUs breakdown and understand the MOUs. From the City’s part, questions/possibilities were brought up as in, what are the common and unique benefits for each MOU? Can Watson help identify more irrelevant information in City contracts, so they can be amended? Can we use the potential model to compare other City employment contracts/MOUs to their own? And lastly, how will the changes of benefit costs bring to social issues such as crime rate? Through creating dictionaries, applying pre-annotators, doing human annotating, training and evaluating the models, we achieved an overall model F1 score of 0.8. For detailed statistics, we got 0.5 in recall for ‘benefits’ and 0.26 for ‘eligibility’. Recalls for other entities are as high as above 0.7. We helped the City of LA in saving contract processing time and the model was able to analyze contract (average 19,000 words) within 10mins.
diff --git a/2018-07-31_pacific_life_mortality_analysis/README.md b/2018-07-31_pacific_life_mortality_analysis/README.md
index 55bfc72..3f3191e 100644
--- a/2018-07-31_pacific_life_mortality_analysis/README.md
+++ b/2018-07-31_pacific_life_mortality_analysis/README.md
@@ -1,6 +1,8 @@
-#Speaker: Wanyi Huang
+# UCI MSBA Capstone Project
+## Pacific Life's Deferred Annuity

-##Title: Pacific Life's Deferred Annuity
+* 2018-07-31
+* Speaker: Wanyi Huang

-###Abstract
+## Abstract
 Our most essential task in this project is to review the mortality experience from Pacific Life. In particular, we are looking into the mortality from Pacific Life's Deferred Annuity block of business. A Deferred Annuity is a retirement investment vehicle and functions like a mutual fund if managed efficiently. Pacific Life is internally required to use their current model to run periodic projections for the purposes of hedging, pricing and various other activities. Our goal is to identify and analyze the policy attributes that led to the disparity from the model and improve the model according to the results of the analysis and potentially provide business intelligence regarding mortality that can support marketing and operational strategies.
diff --git a/2018-08-28_Cancer_Mutations_Scores/README.md b/2018-08-28_Cancer_Mutations_Scores/README.md
index 747afc1..6957115 100644
--- a/2018-08-28_Cancer_Mutations_Scores/README.md
+++ b/2018-08-28_Cancer_Mutations_Scores/README.md
@@ -1,6 +1,7 @@
-#Speaker: Yemi Odeyemi (Ph.D. Candidate in Data Science at Chapman University)
+# Classification and Statistical Analysis of Cancer Mutations Scores

-##Title: Classification and Statistical Analysis of Cancer Mutations Scores
+* 2018-08-28
+* Speaker: Yemi Odeyemi (Ph.D. Candidate in Data Science at Chapman University)

-###Abstract
-The talk will describe part of Yemi's doctoral work on building a statistical and predictive model to classify driver-passenger mutations. A Logit model is used with 10-fold cross-validation. The data was preprocessed to impute missing values using the rule-of-thumb approach, removal of redundant features and feature scaling. Feature selection was determined using a stepwise approach based on AIC. The objective was to determine the optimal class boundary for the probability for discretization. The models were evaluated with Receiver Operator Characteristics - Area under the curve (ROC-AUC) which is based on sensitivity and specificity.
\ No newline at end of file
+## Abstract
+The talk will describe part of Yemi's doctoral work on building a statistical and predictive model to classify driver-passenger mutations. A Logit model is used with 10-fold cross-validation. The data was preprocessed to impute missing values using the rule-of-thumb approach, removal of redundant features and feature scaling. Feature selection was determined using a stepwise approach based on AIC. The objective was to determine the optimal class boundary for the probability for discretization. The models were evaluated with Receiver Operator Characteristics - Area under the curve (ROC-AUC) which is based on sensitivity and specificity.
diff --git a/2018-08-28_RStudio_TuneUp/README.md b/2018-08-28_RStudio_TuneUp/README.md
index a5967d2..b17f742 100644
--- a/2018-08-28_RStudio_TuneUp/README.md
+++ b/2018-08-28_RStudio_TuneUp/README.md
@@ -1,6 +1,7 @@
-#Speaker: Ryan Benz (SoCal Bioinformatics, Inc.)
+# Tune up your RStudio Experience

-##Title: Tune up your RStudio Experience
+* 2018-08-28
+* Speaker: Ryan Benz (SoCal Bioinformatics, Inc.)

-###Abstract
-In this talk, Ryan will discuss some of the ways you can tune-up your RStudio experience to make it easier to work with and to stream line commonly performed tasks to support your coding and analysis work.
\ No newline at end of file
+## Abstract
+In this talk, Ryan will discuss some of the ways you can tune-up your RStudio experience to make it easier to work with and to stream line commonly performed tasks to support your coding and analysis work.
diff --git a/2018-09-25_Color_talk/README.md b/2018-09-25_Color_talk/README.md
index fa42de1..45e4e8b 100644
--- a/2018-09-25_Color_talk/README.md
+++ b/2018-09-25_Color_talk/README.md
@@ -1,6 +1,7 @@
-#Speaker: Emil Hvitfeldt
+# Similarity measure in the space of color palettes

-##Title: Similarity measure in the space of color palettes
+* 2018-09-25
+* Speaker: Emil Hvitfeldt

-###Abstract
+## Abstract
 Related to my project of creating a catalog of all available color palettes in r https://github.com/EmilHvitfeldt/ r-color-palettes and its associated r package https://CRAN.R-project.org/package=paletteer I wanted to expand the project to support a higher degree of explorability. There is already quite a bit theory of color similarity and image similarity that will provide useful but unfortunately insufficient. For standard color similarity using some kind of space measure in a perceptual color space will likely give you what you need, but this approach will start to struggle when you need to compare groups of colors since ordering starts making a difference. Image similarity can likewise be done by comparing color histograms, this approach does still not capture the number of colors or other qualities such as classifying according to type (Sequential, Diverging or Qualitative). My goal is to use expert knowledge to calculate features that can be used to calculates similarities
diff --git a/2018-10-30_How_Marketing_Teams_Leverage_Web_Service_APIs_to_support_their_Data_Science_initiatives/README.md b/2018-10-30_How_Marketing_Teams_Leverage_Web_Service_APIs_to_support_their_Data_Science_initiatives/README.md
index a9d2b00..957d8a1 100644
--- a/2018-10-30_How_Marketing_Teams_Leverage_Web_Service_APIs_to_support_their_Data_Science_initiatives/README.md
+++ b/2018-10-30_How_Marketing_Teams_Leverage_Web_Service_APIs_to_support_their_Data_Science_initiatives/README.md
@@ -1,7 +1,8 @@
-#Speaker: Robert Thomas
+# How Marketing Teams Leverage Web Service API’s to support their Data Science initiatives.

-##Title: How Marketing Teams Leverage Web Service API’s to support their Data Science initiatives.
+* 2018-10-30
+* Speaker: Robert Thomas

-###Abstract:
+## Abstract:

-Examine how Marketing teams leverage Alteryx & R Programming Web Service API’s to support their data science projects. Additionally, emphasize how R Studio web services API’s moves data into a data warehouse to solve marketing analytics problems.
\ No newline at end of file
+Examine how Marketing teams leverage Alteryx & R Programming Web Service API’s to support their data science projects. Additionally, emphasize how R Studio web services API’s moves data into a data warehouse to solve marketing analytics problems.
diff --git a/2018-10-30_Using_R_to_build_large-scale_models_with_Alteryx/README.md b/2018-10-30_Using_R_to_build_large-scale_models_with_Alteryx/README.md
index ea1ddd2..83b341d 100644
--- a/2018-10-30_Using_R_to_build_large-scale_models_with_Alteryx/README.md
+++ b/2018-10-30_Using_R_to_build_large-scale_models_with_Alteryx/README.md
@@ -1,8 +1,9 @@
-#Speaker: Alteryx
+# Using R to build large-scale models with Alteryx

-##Title: Using R to build large-scale models with Alteryx
+* 2018-10-30
+* Speaker: Alteryx

-###Abstract:
+## Abstract:
 Alteryx will demonstrate how you can leverage your R skills in Alteryx to build large-scale models and deploy them. Alteryx makes deploying a predictive model easy with the click of a button. Alteryx exposes an R and Python interface via standard REST API requests, instead of recoding the models from within their native languages, integration of your most advanced analytic models into production systems is simple and painless. build and deploy models.
diff --git a/2018-11-27_Connecting_Javascript_to_Shiny/README.md b/2018-11-27_Connecting_Javascript_to_Shiny/README.md
index fb7a9d0..e9088a5 100644
--- a/2018-11-27_Connecting_Javascript_to_Shiny/README.md
+++ b/2018-11-27_Connecting_Javascript_to_Shiny/README.md
@@ -1,6 +1,7 @@
-#Speaker: Alan D.
+# Custom htmlwidgets: connecting Javascript to Shiny

-##Title: Custom htmlwidgets: connecting Javascript to Shiny
+* 2018-11-27
+* Speaker: Alan Dipert

-###Abstract
-The parts of Shiny that run in the browser are implemented in Javascript, and Shiny can be extended and enhanced in interesting ways by connecting Shiny to custom Javascript or Javascript libraries. Shiny supports 3 means of Javascript integration: custom inputs, custom outputs, and htmlwidgets. Of these, htmlwidgets are the most general and powerful, as they can work offline and embedded in RMarkdown. In this talk I'll give an overview of Shiny's relationship to Javascript, show how new htmlwidgets can be built using the React.js framework and integrated with Shiny, and share resources for doing front-end web development in a Shiny context.
\ No newline at end of file
+## Abstract
+The parts of Shiny that run in the browser are implemented in Javascript, and Shiny can be extended and enhanced in interesting ways by connecting Shiny to custom Javascript or Javascript libraries. Shiny supports 3 means of Javascript integration: custom inputs, custom outputs, and htmlwidgets. Of these, htmlwidgets are the most general and powerful, as they can work offline and embedded in RMarkdown. In this talk I'll give an overview of Shiny's relationship to Javascript, show how new htmlwidgets can be built using the React.js framework and integrated with Shiny, and share resources for doing front-end web development in a Shiny context.
diff --git a/2018-11-27_Data_Storytelling/README.md b/2018-11-27_Data_Storytelling/README.md
index 86e7d8b..703d970 100644
--- a/2018-11-27_Data_Storytelling/README.md
+++ b/2018-11-27_Data_Storytelling/README.md
@@ -1,6 +1,7 @@
-#Speaker: Alyssa C
+# Telling Meaningful Stories with Data

-##Title: Telling Meaningful Stories with Data
+* 2018-11-27
+* Speaker: Alyssa Columbus

-###Abstract
-According to Edward Tufte, an excellent data visualization expresses ‚ complex ideas communicated with clarity, precision and efficiency. Visualization is a dynamic form of persuasion, telling a story through the graphical depiction of statistical information. Few forms of communication are as persuasive as a compelling narrative. So how does a data scientist tell a meaningful story with a visualization? The analysis has to find the story that the data supports, and journalists have become very good at storytelling with visualization via infographics. In that vein, this presentation will share how some journalistic strategies on telling a good story can be applied to data visualization.
\ No newline at end of file
+## Abstract
+According to Edward Tufte, an excellent data visualization expresses ‚ complex ideas communicated with clarity, precision and efficiency. Visualization is a dynamic form of persuasion, telling a story through the graphical depiction of statistical information. Few forms of communication are as persuasive as a compelling narrative. So how does a data scientist tell a meaningful story with a visualization? The analysis has to find the story that the data supports, and journalists have become very good at storytelling with visualization via infographics. In that vein, this presentation will share how some journalistic strategies on telling a good story can be applied to data visualization.