- Instead of "DirectoryToCSV", call this "MDReports", "DirReports", "MDStats", or "FlatStats"
- Realizations - Notion is nice because you have tables and text ... markdown docs are limited to mostly text, but this at least gives powerful survey-of-your-text capabilities
- AUTHOR REPORTS -- by RK, by Tim, by Jane ... if the author's name is stored in frontmatter, could do this
- Word count - added a word count feature, but this works best as a report on PAGES instead of LINKS
- Backlinks - could calculate this ... find pages without any
- TAG REPORTS, sorted - also from frontmatter
- JSON export as well ...
This script will:
- Define configurable parameters:
  - `$rootDirectory`: Directory to scan for markdown files
  - `$orderBy`: Sort order ("default", "creation_newest", "modified_newest", or "domain")
  - `$linkType`: Type of links to extract ("internal" or "external")
- Create a CSV file with appropriate headers
- Recursively scan for all markdown files (`.md` and `.markdown` extensions)
- For each markdown file:
  - Extract links based on `$linkType`:
    - External: Both markdown-style links and plain URLs starting with http(s)
    - Internal: Only markdown-style links to local files
  - Extract root domains from URLs (for external links)
  - Get accurate file creation dates (specifically for macOS)
  - Get file modification dates
  - Store all information for sorting
- Sort the collected data based on `$orderBy`:
  - "default": Original scan order
  - "creation_newest": Newest files first
  - "modified_newest": Most recently modified first
  - "domain": Alphabetically by root domain
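The extraction and sorting steps could look roughly like the sketch below. It is not the script's actual code: the function names (`extractLinks`, `rootDomain`, `sortRows`), the regular expressions, and the row keys (`created`, `modified`, `domain`) are all assumptions made for illustration. The root-domain logic is deliberately naive (it keeps the last two host labels, so `example.co.uk` would come out as `co.uk`), and the link regex also picks up image links and skips links that carry a title.

```php
<?php
// Hypothetical helpers; names, regexes, and array keys are assumptions.

/** Return ['url' => ..., 'name' => ...] entries for the requested $linkType. */
function extractLinks(string $markdown, string $linkType): array
{
    $links = [];
    $seen = [];

    // Markdown-style links: [text](target)
    preg_match_all('/\[([^\]]*)\]\(([^)\s]+)\)/', $markdown, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $target = $m[2];
        if (preg_match('~^https?://~i', $target)) {
            $kind = 'external';
        } elseif (preg_match('~^(?:[a-z][a-z0-9+.-]*:|#)~i', $target)) {
            continue; // other schemes (mailto:) and in-page anchors are skipped
        } else {
            $kind = 'internal';
        }
        if ($kind === $linkType) {
            $links[] = ['url' => $target, 'name' => $m[1]];
            $seen[$target] = true;
        }
    }

    // Plain http(s) URLs only count as external links.
    if ($linkType === 'external') {
        preg_match_all('~https?://[^\s<>()\[\]"\']+~i', $markdown, $plain);
        foreach ($plain[0] as $url) {
            $url = rtrim($url, '.,;:!?'); // drop trailing sentence punctuation
            if (!isset($seen[$url])) {
                $links[] = ['url' => $url, 'name' => ''];
                $seen[$url] = true;
            }
        }
    }
    return $links;
}

/** Naive root domain: the last two labels of the host name. */
function rootDomain(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return '';
    }
    return implode('.', array_slice(explode('.', strtolower($host)), -2));
}

/** Sort collected rows; "default" (or any unknown value) keeps scan order. */
function sortRows(array $rows, string $orderBy): array
{
    switch ($orderBy) {
        case 'creation_newest':
            usort($rows, fn($a, $b) => $b['created'] <=> $a['created']);
            break;
        case 'modified_newest':
            usort($rows, fn($a, $b) => $b['modified'] <=> $a['modified']);
            break;
        case 'domain':
            usort($rows, fn($a, $b) => strcmp($a['domain'], $b['domain']));
            break;
    }
    return $rows;
}
```

Calling `extractLinks($text, 'internal')` on the same text returns only the markdown links that point at local files, which mirrors the `$linkType` switch described above.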
The CSV contains these columns:
- Domain: Root domain of the URL (for external links)
- File: Just the filename
- URL: The complete extracted URL
- Link Name: The text of the markdown link (if different from URL)
- Source File: Full path to the file where the link was found
- Creation Date: Accurate creation date of the source file
- Last Modified Date: Last modification date of the source file
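Writing rows with the columns listed above might be sketched as follows. The helper name `writeLinksCsv` and the sample row are made up for illustration; only the column names come from the list. The separator, enclosure, and escape arguments are passed to `fputcsv` explicitly so the behavior does not depend on PHP's defaults.

```php
<?php
// Column names follow the list above; the helper name is an assumption.
const CSV_COLUMNS = [
    'Domain', 'File', 'URL', 'Link Name',
    'Source File', 'Creation Date', 'Last Modified Date',
];

/** Write the header line, then one CSV line per row; missing keys become empty cells. */
function writeLinksCsv($handle, array $rows): void
{
    fputcsv($handle, CSV_COLUMNS, ',', '"', '\\');
    foreach ($rows as $row) {
        $line = [];
        foreach (CSV_COLUMNS as $column) {
            $line[] = $row[$column] ?? '';
        }
        fputcsv($handle, $line, ',', '"', '\\');
    }
}

// Usage with an in-memory stream; a real run would open the output file instead.
$out = fopen('php://memory', 'w+');
writeLinksCsv($out, [[
    'Domain' => 'example.com',
    'File'   => 'notes.md',
    'URL'    => 'https://example.com/page',
]]);
rewind($out);
echo stream_get_contents($out);
fclose($out);
```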
To use it:
- Save as `Run.php`
- Configure parameters:
  - `$rootDirectory = __DIR__ . '/docs';`
  - `$orderBy = "default"; // or "creation_newest", "modified_newest", "domain"`
  - `$linkType = "external"; // or "internal"`
- Run: `php Run.php`
Notes:
- Creates the `/docs` directory if it doesn't exist
- Uses the `stat` command on macOS for accurate creation dates
- Groups links by domain when using domain sorting
- Filters internal/external links based on `$linkType`
- Outputs to `extracted_links.csv` in the same directory as the script
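For the creation-date point, one possible approach is sketched below. It assumes the script shells out to the BSD `stat` shipped with macOS, where `-f %B` prints the file's birth time in epoch seconds; the function name `creationTimestamp` and the fallback to `filectime()` are assumptions, not confirmed details of the script. On Unix-like systems `filectime()` returns the inode change time, so the fallback is only an approximation of the creation date.

```php
<?php
/** Best-effort creation time of $path as a Unix timestamp (0 if unknown). */
function creationTimestamp(string $path): int
{
    if (PHP_OS_FAMILY === 'Darwin') {
        // BSD stat on macOS: -f %B prints the birth time in epoch seconds.
        $out = shell_exec('stat -f %B ' . escapeshellarg($path) . ' 2>/dev/null');
        if (is_string($out) && ctype_digit(trim($out)) && (int) trim($out) > 0) {
            return (int) trim($out);
        }
    }
    // Fallback: inode change time on Unix-like systems (an approximation).
    $ctime = filectime($path);
    return $ctime === false ? 0 : $ctime;
}

// Usage: format the timestamp for a CSV cell.
echo date('Y-m-d H:i:s', creationTimestamp(__FILE__)), PHP_EOL;
```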