Load unstructured data into Watson Personality Insights (PI) either by interacting with the API or by posting multipart/form-data. Requests are posted in JSON format, and the results are stored back in R in the same format. You can then reformat the JSON results into a CSV and export them for analysis.
The IBM Watson Personality Insights service uses linguistic analysis to extract cognitive and social characteristics from input text such as email, text messages, tweets, forum posts, and more. By deriving cognitive and social preferences, the service helps users to understand, connect to, and communicate with other people on a more personalized level.
- From the R console, download and install the package from GitHub
install.packages("devtools")
devtools::install_github("blacknred0/rwatsonpi")
- Activate rWatsonPI package
library("rwatsonpi") #activate library
If you want to use curl natively on Linux or Mac, you are set. On Windows, some configuration is required first.
- Download curl zip
- Extract all files and folders
- Move the parent folder (e.g. curl-7.27.0-rtmp-ssh2-ssl-sspi-zlib-idn-static-bin-w32) into a directory of your choice, so that curl.exe ends up at a path such as C:\curl\curl.exe
- To run curl from the command line
  - Right-click the "My Computer" icon
  - Select Properties
  - Click the "Advanced system settings" link
  - Go to the "Advanced" tab and click "Environment Variables"
  - Under "System variables" select "Path" and click "Edit"
  - Add a semicolon followed by the path to where your curl.exe is located (e.g. ;C:\curl)
  - Now, under "User variables for username" select "Path" and click "Edit"
  - Add a semicolon followed by the path to where your curl.exe is located (e.g. ;C:\curl)

Now you can run curl from the command line by typing:
curl www.google.com
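To confirm from within R that curl is reachable on the PATH, a minimal sketch using only base R might look like the following. The helper name `curl_on_path` is hypothetical and not part of rwatsonpi.

```r
# Hypothetical helper (not part of rwatsonpi): report whether an
# executable can be found on the PATH of the current R session.
curl_on_path <- function(exe = "curl") {
  path <- Sys.which(exe)   # named character vector; "" when not found
  nzchar(unname(path))
}

if (curl_on_path()) {
  message("curl found at: ", Sys.which("curl"))
} else {
  message("curl not found; check the Path environment variable and restart R")
}
```

R reads the PATH when the session starts, so a session opened before the environment variables were edited may need to be restarted.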
Configuring SSL certificates is only necessary if you want to use SSL. You also have the option to send data unencrypted.
To configure the certificates:
- Go to the CA cert page
- Download "cacert.pem" and save it to your computer
- Move "cacert.pem" into the "C:\curl" folder
Now that the environment is configured, you should be able to use the getPI2() function from rWatsonPI. For more information on how to work with SSL certificates, see the documentation.
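As a quick sanity check of the certificate step, the sketch below verifies that the bundle exists and is not empty. It assumes the C:\curl location used in the steps above; the helper name `ca_bundle_ok` is hypothetical and not part of rwatsonpi.

```r
# Hypothetical helper (not part of rwatsonpi): check that a CA bundle
# exists in the curl folder and is not empty.
ca_bundle_ok <- function(dir = "C:/curl", file = "cacert.pem") {
  bundle <- file.path(dir, file)
  file.exists(bundle) && file.size(bundle) > 0
}

ca_bundle_ok()  # FALSE until a non-empty cacert.pem is in C:/curl
```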
Because of changes to Watson Personality Insights, this process and its functions are the default going forward.
- It is assumed that you already have a BlueMix environment that you can use. If you do not, you will need to create and configure one. Also, if you are on Windows, you will need to configure cURL
- Make sure that you activate the package and necessary libraries
- Run main script
df <- read.table("sample.csv", sep=",", quote = "\"", header=TRUE, fill=FALSE) #import sample csv into R
df$text <- clnTxt(df$text) #clean records and replace field
selMeaningfulRecs(df, df$text) #select meaningful records based on criteria
fetch <- getPI2("https://gateway.watsonplatform.net/personality-insights/api/v2/profile",
                df.sel$text,
                usr="bunch-of-ugly-letters-and-numbers-from-blue-mix",
                pwd="something-really-hard-to-remember") #store PI JSON results
fj <- fmtJSON(fetch, df.sel$person) #attach identifier to results
df.sel.t <- data.frame(person=df.sel$person)
exportPI(df.sel.t, fj, "nameofcsv") #export to csv
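Rather than typing the BlueMix credentials directly into the script, one option is to read them from environment variables. The sketch below uses only base R; the helper `get_secret` and the variable names `WATSON_PI_USR` and `WATSON_PI_PWD` are assumptions made for illustration, not something rwatsonpi defines.

```r
# Hypothetical helper: read a required setting from an environment
# variable and stop with a clear message if it is missing.
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# WATSON_PI_USR and WATSON_PI_PWD are assumed names; set them beforehand
# (for example in ~/.Renviron), then pass the values on to getPI2():
# fetch <- getPI2(url, df.sel$text,
#                 usr = get_secret("WATSON_PI_USR"),
#                 pwd = get_secret("WATSON_PI_PWD"))
```

This keeps the username and password out of scripts that may be shared or committed.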
Prior to the new implementation of Watson Personality Insights, you were able to use forms. If you would still like to use forms to post and retrieve your data, you might want to go to this commit.
- It is assumed that you already have a BlueMix environment that you can use. If you do not, you will need to create and configure one.
- Make sure that you activate the package and necessary libraries
- Run main script
df <- read.table("sample.csv", sep=",", quote = "\"", header=TRUE, fill=FALSE) #import sample csv into R
df$text <- clnTxt(df$text) #clean records and replace field
selMeaningfulRecs(df, df$text) #select meaningful records based on criteria
startPI("C:/watson-developer-cloudpersonality-insights-nodejs") #location where personality insights is located
fetch <- getPI("http://localhost:3000", df.sel$text) #store PI JSON results. For Windows, add parameter "win=TRUE"
fj <- fmtJSON(fetch, df.sel$person) #attach identifier to results
df.sel.t <- data.frame(person=df.sel$person)
exportPI(df.sel.t, fj, "nameofcsv") #export to csv
stopPI()
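The scripts in this README read sample.csv and later refer to the columns person and text. As an illustration of a compatible input file, the sketch below writes a tiny CSV to a temporary folder and reads it back with the same read.table() arguments. The rows are made up, and the exact layout rwatsonpi expects beyond those two columns is an assumption.

```r
# Build a tiny example input. The column names person and text are taken
# from the scripts in this README; the rows themselves are made up.
input <- data.frame(
  person = c("alice", "bob"),
  text   = c("I enjoy long walks, and quiet evenings.",
             "Meetings, deadlines, and more meetings."),
  stringsAsFactors = FALSE
)
csv_path <- file.path(tempdir(), "sample.csv")
write.csv(input, csv_path, row.names = FALSE)  # text fields are quoted

# Same read.table() arguments as in the scripts in this README.
df <- read.table(csv_path, sep = ",", quote = "\"", header = TRUE, fill = FALSE)
str(df)
```

Because the text fields are wrapped in double quotes, commas inside the text do not split a record into extra columns.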
- It is assumed that you already have a BlueMix environment that you can use. If you do not, you will need to create and configure one.
- Make sure that you activate the package and necessary libraries
- Run main script
df <- read.table("sample.csv", sep=",", quote = "\"", header=TRUE, fill=FALSE) #import sample csv into R
df$text <- clnTxt(df$text) #clean records and replace field
selMeaningfulRecs(df, df$text) #select meaningful records based on criteria
#store PI JSON results. The transfer is encrypted; to send data unencrypted, remove ssl. On Windows, add parameter "win=TRUE"
fetch <- getPI("https://www.example.com", df.sel$text, ssl=TRUE)
fj <- fmtJSON(fetch, df.sel$person) #attach identifier to results
df.sel.t <- data.frame(person=df.sel$person)
exportPI(df.sel.t, fj, "nameofcsv") #export to csv
- If the package does not activate properly, you can force the download and activation of all required packages by invoking the following function.
pkgHC() #download and install R packages
See TODO for a list of enhancements that could be made.
This sample code is licensed under Apache 2.0. Full license text is available in LICENSE.
Find more open source projects on the IBM GitHub page.