added Simple Word Frequency in Excel and renamed from Data_integratio…
…n_tips.md
artmg committed Jul 20, 2023
1 parent f4b4a20 commit be31feb
Showing 1 changed file with 26 additions and 0 deletions.
26 changes: 26 additions & 0 deletions Data_integration_tips.md → Data Analysis.md
@@ -1,4 +1,6 @@

# Integration

## Peak email output periods

Sending emails is not always a sign of productivity,
@@ -18,6 +20,7 @@ using Power Query, and filter the emails from the Sent Items folder.
* Filters: Folder Path
* Rows: DateTimeSent

## Chunking

### Group data into time blocks

@@ -40,3 +43,26 @@ are outlined clearly in the article https://www.excelcampus.com/charts/group
* Add Column / Custom Column


## Anonymisation

See [Lubuild / data-extraction](https://github.com/artmg/lubuild/blob/master/help/manipulate/data-extraction.md#anonymisation)

## Frequency

### Simple Word Frequency in Excel 

Takes a column of cells, each containing a short line of text, and returns a count of how often each word occurs, ready to be sorted in descending order.

- Row 1 column headings (A1:D1):
  - Text, Words, Frequency, Range
- Range in D2 (entered as plain text, which the formulas read via `INDIRECT`):
  - `a2:a1000`
- Words formula in B2:
  - `=UNIQUE(TOCOL(IFERROR(REDUCE(,INDIRECT($D$2),LAMBDA(x,y,VSTACK(TEXTSPLIT(x, " "),TEXTSPLIT(y, " ")))),"-")))`
- Frequency formula in C2:
  - `=SUM((LEN(INDIRECT($D$2))-LEN(SUBSTITUTE(INDIRECT($D$2),B2,"")))/LEN(B2))`
- Copy the C cells down alongside every word listed in column B
- Credit for technique to: [https://www.get-digital-help.com/excel-udf-word-frequency/](https://www.get-digital-help.com/excel-udf-word-frequency/)
- Copy, then Paste Special (Text Only) into a separate sheet, and sort by Frequency in descending order

This is a ‘clever’ technique but, as with many spreadsheet ‘coding via formula’ solutions, it is very inefficient, and it begins to fail once the range exceeds several thousand cells.
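
For larger data sets, the same count can be sketched outside the spreadsheet. The Python below is a minimal illustration rather than part of the original technique: the function name `word_frequency` and the sample lines are hypothetical, and it assumes the text column has already been exported as a list of strings. It also differs from the formula in one respect: it counts whole words, whereas the `SUBSTITUTE` approach counts every substring match, so a short word such as "a" is also counted inside "cat".

```python
from collections import Counter


def word_frequency(lines):
    """Return (word, count) pairs, most frequent first.

    `lines` is any iterable of strings, e.g. the cells of the Text column.
    """
    counts = Counter()
    for line in lines:
        # str.split() with no argument splits on any run of whitespace
        counts.update(line.split())
    # Sort by descending count, then alphabetically so ties are stable
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical sample standing in for cells A2:A4
    sample = ["the cat sat", "the dog sat down", "a cat"]
    for word, count in word_frequency(sample):
        print(word, count)
```

This makes a single pass over the text, so it should stay usable well beyond the few thousand cells at which the formula version starts to struggle.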
