For repositories on https://github.com/MicrosoftDocs. Turn your toc.yml file into a .csv file for further work in Excel. Also add metadata from the files, currently ms.author, ms.reviewer, and ms.custom.
Change myrepo to point to the local directory that contains the toc.yml file you wish to inventory. Also change the filename if you wish.
The resulting .csv file will look like this:
In the spreadsheet, each level of the TOC hierarchy is shown in a column named n*. This lets you sort by any column without losing context. To restore the original order, sort by the first column.
The toc.yml file must contain these entries:
- name
- href
- items
Optionally, the following entries may exist. They are ignored and do not appear in the spreadsheet.
- displayName (optional)
- expanded (optional)
If your YAML file contains any entries other than the ones above, the code will not work as is. You would first need to add handling for those entries in expandItems (see below).
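One way to check a TOC before running the script is to scan it for entries outside the supported set. The following is a minimal sketch in base R, not part of the repo: the TOC is written by hand as the nested list a YAML parser would typically return, and the file names, the tocHref entry, and the helper find_unsupported_keys are all hypothetical.

```r
# Hypothetical TOC, written as the nested list a YAML parser would produce.
toc <- list(
  list(name = "Overview", href = "overview.md"),
  list(
    name = "How-to guides",
    expanded = TRUE,
    items = list(
      list(name = "Install", href = "install.md", displayName = "setup"),
      list(name = "Deploy", href = "deploy.md", tocHref = "/azure/")
    )
  )
)

# Entries the README lists as required or safely ignored.
supported <- c("name", "href", "items", "displayName", "expanded")

# Recursively collect every key that is not in the supported set.
find_unsupported_keys <- function(entries) {
  found <- character(0)
  for (entry in entries) {
    found <- c(found, setdiff(names(entry), supported))
    if (!is.null(entry[["items"]])) {
      found <- c(found, find_unsupported_keys(entry[["items"]]))
    }
  }
  unique(found)
}

unsupported <- find_unsupported_keys(toc)
print(unsupported)
```

An empty result suggests the TOC uses only the entries described above; anything else points to an entry that would need handling first.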
This script shows some examples of subsetting the initial data.frame by searching through metadata.
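As an illustration of that kind of subsetting, here is a small sketch in base R. The data.frame is a hand-made stand-in for the merged table, so the column names, authors, and ms.custom values are assumptions rather than output from the script.

```r
# Hypothetical stand-in for the merged table; column names are assumptions.
docs <- data.frame(
  n1 = c("Overview", "How-to guides", "How-to guides"),
  n2 = c(NA, "Install", "Deploy"),
  href = c("overview.md", "install.md", "deploy.md"),
  ms.author = c("alice", "bob", "alice"),
  ms.custom = c(NA, "devx-track-python, mode-api", "mode-api"),
  stringsAsFactors = FALSE
)

# All articles owned by one author.
by_author <- docs[docs$ms.author %in% "alice", ]

# All articles whose ms.custom value mentions a given tag.
# grepl() returns FALSE for NA, so files without ms.custom are dropped.
tagged <- docs[grepl("devx-track-python", docs$ms.custom, fixed = TRUE), ]

print(by_author$href)
print(tagged$href)
```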
- The script is not set up to handle includes in the TOC.
- Metadata will be added only for files in the same directory as the TOC. Files in sub-directories will not be read. Use the function again with the sub-directory as input to get those contents.
- The current version may not work well with thousands of entries. I've been using it for a directory with a little over 250 files, which takes around 6-7 seconds to complete.
Skip to the next section if you're already using R.
- Install R.
  - During installation, copy the path where R will be installed.
  - When installation is complete, edit your PATH system variable and add the path you just copied, appending "\bin" to the end. For example, add "C:\Program Files\R\R-4.1.2\bin" if your installation directory is "C:\Program Files\R\R-4.1.2".
- Clone this repo:
  git clone https://github.com/sdgilley/toc-to-csv.git --depth 1
- Open a Windows terminal as Admin.
- cd to the cloned repo.
- Run the installs.R program (you only need to do this once):
  rscript.exe installs.R
- Edit the file Create-CSV.R in any text editor.
- Change the path to your toc file at the top.
- Change the name of the file to be written if you wish. It defaults to all-docs.csv.
- Save the file.
- In the terminal window, run the script:
  rscript.exe Create-CSV.R
- The file will be created in your current directory.
If you already have R installed, here are the steps to use the script in your IDE.
- Clone this repo:
  git clone https://github.com/sdgilley/toc-to-csv.git --depth 1
- Open it in your R IDE, such as RStudio.
- Open the file installs.R and run it to install packages.
- Open the file Create-CSV.R.
- Change the path to your toc file in myrepo.
- Change the name of the file to be written if you wish.
- Run the entire file.
- The .csv file will be created in your current directory.
The main script is Create-CSV.R. This is where you specify the path to your TOC file, and can change the name of the .csv file that is produced.
The heavy work occurs in the function createFinal, which reads in the yml file and creates a data.table. The initial version of the data is an object that contains three columns: name, href, and items.
The function expandItems is used to expand an items list and merge in columns for the next level of the hierarchy. This continues until there are no more levels to expand.
Finally, if merge==T, the function getMetadata is used to obtain metadata for each file in the directory. This is then merged with the data from the toc.
Some cleanup is performed and the merged file is returned. Create-CSV.R then writes this merged object as a csv file.
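To make the level-by-level expansion concrete, the sketch below flattens a small TOC into one row per entry with one column per hierarchy level. It is a simplified recursive version in base R, not the repo's expandItems (which works on a data.table and merges one level at a time); the sample TOC, the helper flatten_toc, and the column names order, n1, and n2 are assumptions made for illustration.

```r
# Hypothetical TOC, written as the nested list a YAML parser would produce.
toc <- list(
  list(name = "Overview", href = "overview.md"),
  list(
    name = "How-to guides",
    items = list(
      list(name = "Install", href = "install.md"),
      list(name = "Deploy", href = "deploy.md")
    )
  )
)

# Walk the tree, carrying the names of the ancestors along as `path`.
flatten_toc <- function(entries, path = character(0)) {
  rows <- list()
  for (entry in entries) {
    here <- c(path, entry[["name"]])
    href <- if (is.null(entry[["href"]])) NA_character_ else entry[["href"]]
    rows[[length(rows) + 1]] <- list(path = here, href = href)
    if (!is.null(entry[["items"]])) {
      rows <- c(rows, flatten_toc(entry[["items"]], here))
    }
  }
  rows
}

rows <- flatten_toc(toc)
depth <- max(vapply(rows, function(r) length(r$path), integer(1)))

# Pad each path with NA so every row has the same number of level columns.
levels <- do.call(rbind, lapply(rows, function(r) {
  c(r$path, rep(NA_character_, depth - length(r$path)))
}))
colnames(levels) <- paste0("n", seq_len(depth))

flat <- data.frame(
  order = seq_along(rows),  # keeps the original TOC order for re-sorting
  levels,
  href = vapply(rows, function(r) r$href, character(1)),
  stringsAsFactors = FALSE
)
print(flat)
```

Because each row repeats the names of its ancestors, the table can be sorted by any column and still show where an article sits in the TOC.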
The script uses three functions. Each function is in its own file. The functions are:
- createFinal(myrepo, merge): myrepo is a path to the repo containing your files. This function:
  - reads the toc.yml file in the myrepo path into an object
  - calls expandItems to form the set of columns to describe each file
  - if merge==T (default): calls getMetadata to add metadata. Set merge to F to skip adding metadata.
  - returns the final data.frame containing columns for each level of the TOC, filename, and metadata
- expandItems(dt): dt is a data.table that contains a list column, named items. This function:
  - expands the list into new columns
  - deletes all columns except for the ones corresponding to name, href, and items
  - merges these columns back to the data.table
  - returns the data.table
- getMetadata(path): path is the path to the directory that contains your TOC. This function:
  - loops through the .md files in the input directory. (It does not traverse subdirectories.)
  - extracts metadata from each file (currently ms.author, ms.reviewer, and ms.custom; it would be simple to modify this to add others if you wish)
  - returns a data.table that contains each filename and its metadata
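The metadata step can be pictured with the following base R sketch. It is not the repo's getMetadata: it writes two throwaway .md files to a temporary directory, then pulls the three fields out of each one with a simple "field: value" line match. The file contents, the helper read_field, and the href column name are assumptions, and a plain data.frame stands in for the data.table.

```r
# Create a throwaway directory with two hypothetical articles.
docs <- file.path(tempdir(), "toc-to-csv-demo")
dir.create(docs, showWarnings = FALSE)
writeLines(
  c("---", "title: Install", "ms.author: alice", "ms.custom: devx", "---", "# Install"),
  file.path(docs, "install.md")
)
writeLines(
  c("---", "title: Deploy", "ms.author: bob", "ms.reviewer: carol", "---", "# Deploy"),
  file.path(docs, "deploy.md")
)

fields <- c("ms.author", "ms.reviewer", "ms.custom")

# Return the value of the first "field: value" line, or NA if it is absent.
# This scans every line of the file, not only the front matter block.
read_field <- function(lines, field) {
  key <- paste0(field, ":")
  hit <- lines[startsWith(lines, key)]
  if (length(hit) == 0) return(NA_character_)
  trimws(substring(hit[1], nchar(key) + 1))
}

# list.files() is not recursive by default, so subdirectories are skipped.
files <- list.files(docs, pattern = "\\.md$", full.names = TRUE)

metadata <- do.call(rbind, lapply(files, function(f) {
  lines <- readLines(f, warn = FALSE)
  values <- vapply(fields, function(field) read_field(lines, field), character(1))
  data.frame(href = basename(f), as.list(values),
             stringsAsFactors = FALSE, check.names = FALSE)
}))
print(metadata)
```

Adding another field in this sketch only means extending the fields vector; a missing field shows up as NA rather than stopping the loop.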