Commit
Showing 18 changed files with 103 additions and 61 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Classification and Regression Trees (CART) | ||
|
||
* 2018-01-23 | ||
* Speaker: John Peach | ||
|
||
## Abstract | ||
This talk dives deep into the mathematical development of CART trees. It examines the | ||
assumptions and trade-offs that go into the development of the model and then | ||
how a tree is determined given a specific dataset. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,10 @@ | ||
##Title: Decision Trees with R | ||
# Using and Improving Decision Trees in R | ||
|
||
* 2018-01-23 | ||
* Speaker: Robert Mohr | ||
|
||
## Abstract | ||
This talk shows you how to use the various R packages to build trees and display | ||
them. It then presents a way to improve the formatting of the tree output. | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,8 @@ | ||
#Speaker: Alfredo F. | ||
# Exploring Tree Models | ||
|
||
##Title: Exploring Tree Models | ||
* 2018-02-27 | ||
* Speaker: J. Alfredo Freites | ||
|
||
###Abstract | ||
## Abstract | ||
This talk covers how to build and test trees in R. It covers the splitting | ||
approach and walks through an example. |
9 changes: 5 additions & 4 deletions
2018-03-20_machine_learning_with_tree-based_models_in_r/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
#Speaker: Bella Feng | ||
# Machine Learning with Tree-Based Models in R | ||
|
||
##Title: Machine Learning with Tree-Based Models in R | ||
* 2018-03-20 | ||
* Speaker: Bella Feng | ||
|
||
###Abstract | ||
Bella's talk will summarize what she learned from this DataCamp class, covering Classification & Regression Trees, Bagged Trees and Random Forests. Code for building, evaluating and tuning model parameters will be shared. If time permits, we will also look into strategies for entering a Kaggle competition with the different tools. | ||
## Abstract | ||
Bella's talk will summarize what she learned from this DataCamp class, covering Classification & Regression Trees, Bagged Trees and Random Forests. Code for building, evaluating and tuning model parameters will be shared. If time permits, we will also look into strategies for entering a Kaggle competition with the different tools. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,9 @@ | ||
#Speaker: DF3 | ||
# Chapman University DataFest 2018 - Team DF3 | ||
|
||
##Title: DataFest 2018 | ||
* 2018-05-29 | ||
* Speaker: DataFest Team DF3 - Chapman University | ||
|
||
###Abstract | ||
|
||
## Abstract | ||
(from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set." | ||
https://ww2.amstat.org/education/datafest/ | ||
https://ww2.amstat.org/education/datafest/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,9 @@ | ||
#Speaker: Data Dirtbags | ||
# Chapman University DataFest 2018 - Team Data Dirtbags | ||
|
||
##Title: DataFest 2018 | ||
* 2018-05-29 | ||
* Speaker: DataFest Team Data Dirtbags | ||
|
||
###Abstract | ||
|
||
## Abstract | ||
(from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set." | ||
https://ww2.amstat.org/education/datafest/ | ||
https://ww2.amstat.org/education/datafest/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,9 @@ | ||
#Speaker: Seems Logit | ||
# Chapman University DataFest 2018 - Team Seems Logit | ||
|
||
##Title: DataFest 2018 | ||
* 2018-05-29 | ||
* Speaker: DataFest Team Seems Logit | ||
|
||
###Abstract | ||
|
||
## Abstract | ||
(from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set." | ||
https://ww2.amstat.org/education/datafest/ | ||
https://ww2.amstat.org/education/datafest/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,9 @@ | ||
#Speaker: Team Epsilon | ||
# Chapman University DataFest 2018 - Team Epsilon | ||
|
||
##Title: DataFest 2018 | ||
* 2018-05-29 | ||
* Speaker: DataFest Team Epsilon | ||
|
||
###Abstract | ||
|
||
## Abstract | ||
(from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set." | ||
https://ww2.amstat.org/education/datafest/ | ||
https://ww2.amstat.org/education/datafest/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,10 @@ | ||
#Speaker: The TerminatR | ||
# Chapman University DataFest 2018 - Team Terminal R | ||
## Maximizing Job Posting Success | ||
|
||
##Title: Maximizing Job Posting Success | ||
* 2018-05-29 | ||
* Speaker: DataFest Team Terminal R - CSU Fullerton | ||
|
||
###Abstract | ||
|
||
## Abstract | ||
(from the ASA website) "The American Statistical Association (ASA) DataFest is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set." | ||
https://ww2.amstat.org/education/datafest/ | ||
https://ww2.amstat.org/education/datafest/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,8 @@ | ||
#Speaker: Gloria Gong | ||
# UCI MSBA Capstone Project | ||
## IBM Watson - The True Benefits of HR contracts | ||
|
||
##Title: IBM Watson - The True Benefits of HR contracts | ||
* 2018-07-31 | ||
* Speaker: Gloria Gong | ||
|
||
###Abstract | ||
Our team project was to help determine the true benefit costs in Human Resource contracts, working with IBM and the City of Los Angeles for our capstone. We used IBM Watson software (mainly Knowledge Studio and Natural Language Understanding) to conduct the analysis, and our target data was the MOUs published on the City of LA website. Throughout the five-month project, we were given timetables for learning how the City's MOUs are broken down and understanding them. The City raised several questions and possibilities: What are the common and unique benefits for each MOU? Can Watson help identify more irrelevant information in City contracts, so they can be amended? Can we use the potential model to compare other City employment contracts/MOUs to their own? And lastly, how will changes in benefit costs affect social issues such as the crime rate? By creating dictionaries, applying pre-annotators, annotating by hand, and training and evaluating the models, we achieved an overall model F1 score of 0.8. In detail, recall was 0.5 for ‘benefits’ and 0.26 for ‘eligibility’; recall for the other entities was above 0.7. We helped the City of LA save contract processing time: the model was able to analyze a contract (19,000 words on average) within 10 minutes. | ||
## Abstract | ||
Our team project was to help determine the true benefit costs in Human Resource contracts, working with IBM and the City of Los Angeles for our capstone. We used IBM Watson software (mainly Knowledge Studio and Natural Language Understanding) to conduct the analysis, and our target data was the MOUs published on the City of LA website. Throughout the five-month project, we were given timetables for learning how the City's MOUs are broken down and understanding them. The City raised several questions and possibilities: What are the common and unique benefits for each MOU? Can Watson help identify more irrelevant information in City contracts, so they can be amended? Can we use the potential model to compare other City employment contracts/MOUs to their own? And lastly, how will changes in benefit costs affect social issues such as the crime rate? By creating dictionaries, applying pre-annotators, annotating by hand, and training and evaluating the models, we achieved an overall model F1 score of 0.8. In detail, recall was 0.5 for ‘benefits’ and 0.26 for ‘eligibility’; recall for the other entities was above 0.7. We helped the City of LA save contract processing time: the model was able to analyze a contract (19,000 words on average) within 10 minutes. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,8 @@ | ||
#Speaker: Wanyi Huang | ||
# UCI MSBA Capstone Project | ||
## Pacific Life's Deferred Annuity | ||
|
||
##Title: Pacific Life's Deferred Annuity | ||
* 2018-07-31 | ||
* Speaker: Wanyi Huang | ||
|
||
###Abstract | ||
## Abstract | ||
Our most essential task in this project is to review the mortality experience from Pacific Life. In particular, we are looking into the mortality from Pacific Life's Deferred Annuity block of business. A Deferred Annuity is a retirement investment vehicle and functions like a mutual fund if managed efficiently. Pacific Life is internally required to use their current model to run periodic projections for the purposes of hedging, pricing and various other activities. Our goal is to identify and analyze the policy attributes that led to the disparity from the model and improve the model according to the results of the analysis and potentially provide business intelligence regarding mortality that can support marketing and operational strategies. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
#Speaker: Yemi Odeyemi (Ph.D. Candidate in Data Science at Chapman University) | ||
# Classification and Statistical Analysis of Cancer Mutations Scores | ||
|
||
##Title: Classification and Statistical Analysis of Cancer Mutations Scores | ||
* 2018-08-28 | ||
* Speaker: Yemi Odeyemi (Ph.D. Candidate in Data Science at Chapman University) | ||
|
||
###Abstract | ||
The talk will describe part of Yemi's doctoral work on building a statistical and predictive model to classify driver-passenger mutations. A Logit model is used with 10-fold cross-validation. The data was preprocessed by imputing missing values using the rule-of-thumb approach, removing redundant features, and scaling features. Features were selected using a stepwise approach based on AIC. The objective was to determine the optimal class boundary for discretizing the predicted probability. The models were evaluated with the Receiver Operating Characteristic - Area Under the Curve (ROC-AUC), which is based on sensitivity and specificity. | ||
## Abstract | ||
The talk will describe part of Yemi's doctoral work on building a statistical and predictive model to classify driver-passenger mutations. A Logit model is used with 10-fold cross-validation. The data was preprocessed by imputing missing values using the rule-of-thumb approach, removing redundant features, and scaling features. Features were selected using a stepwise approach based on AIC. The objective was to determine the optimal class boundary for discretizing the predicted probability. The models were evaluated with the Receiver Operating Characteristic - Area Under the Curve (ROC-AUC), which is based on sensitivity and specificity. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
#Speaker: Ryan Benz (SoCal Bioinformatics, Inc.) | ||
# Tune up your RStudio Experience | ||
|
||
##Title: Tune up your RStudio Experience | ||
* 2018-08-28 | ||
* Speaker: Ryan Benz (SoCal Bioinformatics, Inc.) | ||
|
||
###Abstract | ||
In this talk, Ryan will discuss some of the ways you can tune up your RStudio experience to make it easier to work with and to streamline commonly performed tasks in support of your coding and analysis work. | ||
## Abstract | ||
In this talk, Ryan will discuss some of the ways you can tune up your RStudio experience to make it easier to work with and to streamline commonly performed tasks in support of your coding and analysis work. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
#Speaker: Emil Hvitfeldt | ||
# Similarity measure in the space of color palettes | ||
|
||
##Title: Similarity measure in the space of color palettes | ||
* 2018-09-25 | ||
* Speaker: Emil Hvitfeldt | ||
|
||
###Abstract | ||
## Abstract | ||
Related to my project of creating a catalog of all available color palettes in R https://github.com/EmilHvitfeldt/r-color-palettes and its associated R package https://CRAN.R-project.org/package=paletteer I wanted to expand the project to support a higher degree of explorability. There is already quite a bit of theory on color similarity and image similarity that will prove useful but unfortunately insufficient. For standard color similarity, using some kind of measure in a perceptual color space will likely give you what you need, but this approach will start to struggle when you need to compare groups of colors, since ordering starts making a difference. Image similarity can likewise be computed by comparing color histograms, but this approach still does not capture the number of colors or other qualities such as classification by type (Sequential, Diverging or Qualitative). My goal is to use expert knowledge to calculate features that can be used to calculate similarities. |
9 changes: 5 additions & 4 deletions
...s_Leverage_Web_Service_APIs_to_support_their_Data_Science_initiatives/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,8 @@ | ||
#Speaker: Robert Thomas | ||
# How Marketing Teams Leverage Web Service APIs to support their Data Science initiatives. | ||
|
||
##Title: How Marketing Teams Leverage Web Service APIs to support their Data Science initiatives. | ||
* 2018-10-30 | ||
* Speaker: Robert Thomas | ||
|
||
###Abstract: | ||
## Abstract: | ||
|
||
Examine how Marketing teams leverage Alteryx & R Programming Web Service APIs to support their data science projects. Additionally, emphasize how RStudio web service APIs move data into a data warehouse to solve marketing analytics problems. | ||
Examine how Marketing teams leverage Alteryx & R Programming Web Service APIs to support their data science projects. Additionally, emphasize how RStudio web service APIs move data into a data warehouse to solve marketing analytics problems. |
7 changes: 4 additions & 3 deletions
2018-10-30_Using_R_to_build_large-scale_models_with_Alteryx/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
#Speaker: Alan D. | ||
# Custom htmlwidgets: connecting Javascript to Shiny | ||
|
||
##Title: Custom htmlwidgets: connecting Javascript to Shiny | ||
* 2018-11-27 | ||
* Speaker: Alan Dipert | ||
|
||
###Abstract | ||
The parts of Shiny that run in the browser are implemented in Javascript, and Shiny can be extended and enhanced in interesting ways by connecting Shiny to custom Javascript or Javascript libraries. Shiny supports 3 means of Javascript integration: custom inputs, custom outputs, and htmlwidgets. Of these, htmlwidgets are the most general and powerful, as they can work offline and embedded in RMarkdown. In this talk I'll give an overview of Shiny's relationship to Javascript, show how new htmlwidgets can be built using the React.js framework and integrated with Shiny, and share resources for doing front-end web development in a Shiny context. | ||
## Abstract | ||
The parts of Shiny that run in the browser are implemented in Javascript, and Shiny can be extended and enhanced in interesting ways by connecting Shiny to custom Javascript or Javascript libraries. Shiny supports 3 means of Javascript integration: custom inputs, custom outputs, and htmlwidgets. Of these, htmlwidgets are the most general and powerful, as they can work offline and embedded in RMarkdown. In this talk I'll give an overview of Shiny's relationship to Javascript, show how new htmlwidgets can be built using the React.js framework and integrated with Shiny, and share resources for doing front-end web development in a Shiny context. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
#Speaker: Alyssa C | ||
# Telling Meaningful Stories with Data | ||
|
||
##Title: Telling Meaningful Stories with Data | ||
* 2018-11-27 | ||
* Speaker: Alyssa Columbus | ||
|
||
###Abstract | ||
According to Edward Tufte, an excellent data visualization expresses "complex ideas communicated with clarity, precision and efficiency." Visualization is a dynamic form of persuasion, telling a story through the graphical depiction of statistical information. Few forms of communication are as persuasive as a compelling narrative. So how does a data scientist tell a meaningful story with a visualization? The analysis has to find the story that the data supports, and journalists have become very good at storytelling with visualization via infographics. In that vein, this presentation will share how some journalistic strategies on telling a good story can be applied to data visualization. | ||
## Abstract | ||
According to Edward Tufte, an excellent data visualization expresses "complex ideas communicated with clarity, precision and efficiency." Visualization is a dynamic form of persuasion, telling a story through the graphical depiction of statistical information. Few forms of communication are as persuasive as a compelling narrative. So how does a data scientist tell a meaningful story with a visualization? The analysis has to find the story that the data supports, and journalists have become very good at storytelling with visualization via infographics. In that vein, this presentation will share how some journalistic strategies on telling a good story can be applied to data visualization. |