7/25/2021 William Craig - Admin@wrcraig.com https://wrcraig.com
Objective of these scripts:
Warranty:
None whatsoever. Use at your own risk. Always back up your system and data, and experiment using a test area. Works on my system using Ubuntu 20.04 or 18.04 with Apache2
Table of Contents
Directory Structure Of Data Files To Be Indexed:
Using .htaccess and multiple index files on a website
Samples of description lines for conversion to index files
#########################################
Installation
Extract all the files in the zip file archive into a single directory. Open a terminal and change to that directory, then enter "bash menu.bash".
The archive contains the following files:
1. ReplSpacesinSubdirsFilenames.bash
2. Add-eqsign-ToDirectoryNames.bash
3. UpdateOwnerAndPermissions.bash
4. UpdateOwnerAndPermissionsRecursively.bash
5. htaccess-config.bash
6. functions.bash
7. htaccess2index.bash
8. Run-Htaccess2Index-Recursively.bash
9. enterdescriptions.xls
10. menu.bash
11. README.html
12. Sample.of.index.html.output.png
#########################################
1. README.txt
    This file.
2. menu.bash
    Convenient system to access/invoke the scripts that generate web page index files.
3. htaccess-config.bash
    This is the config file for variables used in the following four bash scripts.
    ALL SCRIPTS MUST RESIDE IN THE SAME DIRECTORY
3a. Run-Htaccess2Index-Recursively.bash
    This script acts in concert with the next script to recursively parse all subdirectories of the directory you specify in the config file.
3b. htaccess2index.bash
    This script creates index.html and index2.html in the directories you specify in the config file.
3c. UpdateOwnerAndPermissionsRecursively.bash
    This script acts in concert with the next script to recursively parse all subdirectories of the directory you specify in the config file.
3d. UpdateOwnerAndPermissions.bash
    Updates file and directory ownerships and permissions in the directories specified above.
---- Other files to make your life easier ----
4. ReplSpacesinSubdirsFilenames.bash (Recursive action)
    Replaces spaces in filenames with a dot for those applications which fail if the filename or directory name has spaces.
5. Add-eqsign-ToDirectoryNames.bash (Recursive action)
    Adds an equal sign to directory names to avoid conflict with similar file names.
    (Run number 4 first; this fails if there are any spaces in the directory name.)
6. enterdescriptions.xls
    Spreadsheet (.xls) to make entry and formatting of description data easier.
    Sample of one line of the final output of the description data:
    AddDescription "<b>1931, IMDB 7.8, <a href='https://www.imdb.com/video/vi1168638233?playlistId=tt0021884&ref_=tt_ov_vi'>Trailer</a> </b>Henry Frankenstein is a doctor who is trying to discover a way to make the dead walk. He succeeds and creates a monster that has to deal with living again.<hr>" Frankenstein.1931.720p.BluRay.H264.AAC-RARBG.mp4
7. Sample.of.index.html.output.png
    Like the file name says.
8. functions.bash
    A collection of bash functions common to all the scripts.
#########################################
Using the menu
----------------
Menu item 1
----------------
Menu item 2
----------------
Menu item 3
----------------
Menu item 4
----------------
Menu item 5
----------------
Menu item 6
----------------
Menu item 7
----------------
Menu item 8
    -UpdateOwnerAndPermissionsRecursively.bash calls UpdateOwnerAndPermissions.bash to run in each subdirectory.
    -You may direct in the config file whether or not the scripts will also run in the BaseDirectory defined in the config file, in addition to the default of all of its subdirectories.
    -MUST RUN AS A SUPERUSER/ROOT. You will be asked to enter an administrator or sudoer password.
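As a rough illustration of what a recursive ownership and permission update can look like, here is a minimal sketch. It is not the code in UpdateOwnerAndPermissions.bash: the function name set_tree_perms, its arguments, and the example modes 755 and 644 are assumptions made for illustration. Unlike the configurable behaviour described above, this sketch always includes the starting directory itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from UpdateOwnerAndPermissions.bash.
# Usage: set_tree_perms BASE_DIR DIR_MODE FILE_MODE [OWNER:GROUP]
set_tree_perms() {
    local base=$1 dir_mode=$2 file_mode=$3 owner=${4:-}

    if [ ! -d "$base" ]; then
        echo "not a directory: $base" >&2
        return 1
    fi

    # Changing ownership normally needs root, so it is skipped
    # when no OWNER:GROUP argument is given.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$base" || return 1
    fi

    # Directories need the execute bit to be entered, so they get
    # their own mode, separate from regular files.
    find "$base" -type d -exec chmod "$dir_mode" {} + || return 1
    find "$base" -type f -exec chmod "$file_mode" {} + || return 1
}
```

A call such as set_tree_perms /path/to/BaseDirectory 755 644 www-data:www-data would need to be run with sudo; the path and the www-data owner are placeholders. Try it on a test area first.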
----------------
Menu item 9
    -It then produces html download links to the files and combines the description data from .htaccess (or similar) with the user's headers and footers into files named "index.html" and "index2.html", the first sorted by filename and the second sorted by description. Styles are not used, just plain old html.
    -Once the index files are created you may disable Apache's directory views as a security measure (Options -Indexes). This will help prevent evildoers from browsing freely where there are no index files.
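To make the two sort orders concrete, here is a minimal sketch of building a plain html page of download links from filename/description pairs. It is not how htaccess2index.bash is written: the function name make_index and the tab-separated input format are assumptions, and the user's headers and footers are reduced to a bare title.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; htaccess2index.bash may work quite differently.
# Reads "filename<TAB>description" records on stdin and prints a plain
# html page of download links.
#   $1 = page title
#   $2 = sort field: 1 = filename (default), 2 = description
make_index() {
    local title=$1 sort_field=${2:-1} file desc

    printf '<html><head><title>%s</title></head><body>\n' "$title"
    printf '<h1>%s</h1>\n<table>\n' "$title"

    # LC_ALL=C keeps the sort order independent of the user's locale.
    LC_ALL=C sort -t $'\t' -k "${sort_field},${sort_field}" |
    while IFS=$'\t' read -r file desc; do
        # The description is emitted as-is because it may contain html.
        printf '<tr><td><a href="%s">%s</a></td><td>%s</td></tr>\n' \
            "$file" "$file" "$desc"
    done

    printf '</table>\n</body></html>\n'
}

# Example (file names are placeholders):
#   make_index "My Files" 1 < records.tsv > index.html
#   make_index "My Files" 2 < records.tsv > index2.html
```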
----------------
Menu item 10
#########################################
1. Uses the bash scripting language included in most Linux flavors
    1a. Linux bash scripts can now be run on Windows 10; see: https://www.howtogeek.com/249966/how-to-install-and-use-the-linux-bash-shell-on-windows-10/ (I have not verified this.)
2. Uses "mediainfo" to determine the play-length of any audio or video files (sudo apt install mediainfo). If mediainfo is not installed the duration of media files will not be included in the indexes.
3. ASSUMES that there are no spaces in the filenames or directories. You should use menu item 4 to recursively replace any spaces with dots in the source and sub directories and all filenames.
4. Be sure to differentiate directory names from partial filenames. You can ensure this by adding a tag to the end of directory names. Menu item #4 will make mass changes to append an equals sign to all directory names from subdirectory level 1 or 2 down to the lowest subdirectories. If needed, you will have to manually add an equals sign to the BaseDirectory (the highest level).
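For readers who want to see what the two renaming steps in items 3 and 4 amount to, here is a minimal sketch. The function names dots_for_spaces and tag_directories are hypothetical and the bundled scripts may behave differently. Both functions leave the starting directory itself untouched and skip any rename whose target already exists.

```shell
#!/usr/bin/env bash
# Hypothetical sketches; the bundled scripts may behave differently.

# Replace spaces with dots in every file and directory name below BASE.
dots_for_spaces() {
    local base=$1 path dir name target
    # -depth visits children before their parent, so a parent's path is
    # still valid while its contents are being renamed.
    while IFS= read -r -d '' path; do
        dir=${path%/*}
        name=${path##*/}
        target="$dir/${name// /.}"
        if [ -e "$target" ]; then
            echo "skipping $path: $target already exists" >&2
            continue
        fi
        mv -- "$path" "$target"
    done < <(find "$base" -mindepth 1 -depth -name '* *' -print0)
}

# Append "=" to every directory name below BASE that does not end in one.
tag_directories() {
    local base=$1 path
    while IFS= read -r -d '' path; do
        if [ -e "$path=" ]; then
            echo "skipping $path: $path= already exists" >&2
            continue
        fi
        mv -- "$path" "$path="
    done < <(find "$base" -mindepth 1 -depth -type d ! -name '*=' -print0)
}
```

Run the space replacement first, then the tagging, and try both on a copy of your data before touching the real tree.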
###############################################
Directory Structure Of Data Files To Be Indexed:
>>> BaseDirectory specified in the config file
 |
 |
 |--->> Level 1 Subdirectory
 |        |
 |        |
 |        |---> Level 2 Subdirectory
 |        |        |
 |        |        |---> Level 3 Subdirectory
 |        |        |
 |        |        Etc.
 |        |
 |--->> Another Level 1 Subdirectory
 |
 Etc.
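A recursive driver over a tree like the one above can be sketched as follows. This is only an illustration of the idea, not the code in the "-Recursively" scripts: the function name run_in_each_dir and its yes/no switch for including the BaseDirectory are assumptions, and sort -z assumes GNU sort as found on Ubuntu.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a recursive driver; names are illustrative only.
# Usage: run_in_each_dir BASE_DIR INCLUDE_BASE COMMAND [ARGS...]
#   INCLUDE_BASE=yes also runs the command in BASE_DIR itself.
run_in_each_dir() {
    local base=$1 include_base=$2 dir min_depth=1
    shift 2

    if [ "$include_base" = yes ]; then
        min_depth=0
    fi

    while IFS= read -r -d '' dir; do
        # Subshell so the cd does not leak into the caller; stdin comes
        # from /dev/null so the command cannot swallow the directory list.
        ( cd "$dir" && "$@" ) </dev/null || echo "failed in $dir" >&2
    done < <(find "$base" -mindepth "$min_depth" -type d -print0 | LC_ALL=C sort -z)
}

# Example: list the contents of every subdirectory, skipping the base.
#   run_in_each_dir /path/to/BaseDirectory no ls
```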
#########################################
Using .htaccess and multiple index files on a website
-- Why do we want an index file and/or an .htaccess file on a webserver?
Without an index file, or a properly configured .htaccess file, browsers might be able to freely browse any directory on your server. That is not an optimum way to run a server.
.htaccess is a file that the Apache webserver can use to control the access and display of all directories that do not have a file named index.xxx (where xxx can be htm, html, php or other apache2-acceptable extension).
An index file uses HTML coding to display a more presentable and flexible web page than the .htaccess file.
You can also have both files. The hidden .htaccess file (a hidden file is indicated by the leading dot) can act as a backup to any missing index file just in case some evildoer finds a way to get around your carefully crafted net of web pages with index files.
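One quick way to see which directories currently have no index file, and would therefore rely on .htaccess alone, is sketched below. The function name dirs_without_index is hypothetical, and the check looks only for index.html, not for other index extensions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every directory at or below BASE that
# does not contain a file named index.html.
dirs_without_index() {
    local base=$1 dir
    while IFS= read -r -d '' dir; do
        if [ ! -e "$dir/index.html" ]; then
            printf '%s\n' "$dir"
        fi
    done < <(find "$base" -type d -print0)
}

# Example:
#   dirs_without_index /path/to/BaseDirectory
```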
#######################################
Below is an example of an .htaccess file which, if other configuration options are completed, will frustrate most evildoers and require a userid and password to open the directory (even if you have an index file):
AuthName "Restricted Area"
AuthType Basic
# REQUIRE AN ID AND PASSWORD TO OPEN THIS AND ALL SUBDIRECTORIES (see the Apache2 documentation for details)
AuthUserFile /xxxx/xxx/.htpasswd
require valid-user
# STRONG HTACCESS PROTECTION
<Files ~ "^.*\.([Hh][Tt][Aa])">
order allow,deny
deny from all
</Files>
# Deny access to evil robots site rippers, offline browsers, and other nasty scum
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]
# Disable Directory views. Absence of index.html in a directory will now result in a "Forbidden" message:
Options -Indexes
# Make some file extensions download only, not in-line play:
<FilesMatch "\.(mov|mp3|mp4|jpg|pdf|mkv)$">
ForceType application/octet-stream
</FilesMatch>
# <IfModule mod_headers.c>
Header set X-XSS-Protection "1; mode=block"
Header set X-Frame-Options "SAMEORIGIN"
Header set X-Content-Type-Options "nosniff"
Header always set Strict-Transport-Security "max-age=63072000; includeSubDomains"
# Header set Content-Security-Policy ... (This is one you REALLY need to study to complete)
Header set Referrer-Policy "same-origin"
Header set Feature-Policy "geolocation 'self'; vibrate 'none'"
# </IfModule>
# This code may be added to your site via .htaccess or Apache config.
# If used in .htaccess, leave the if statement commented out or deleted.
# If used in Apache2 config, uncomment the two lines: <IfModule mod_headers.c> and: </IfModule>
Understand that this technique includes commonly used configurations for each of the included headers. You can (and should) go through each one to make sure that the configuration matches the requirements and goals of your site. Also remember to test thoroughly before going live.
(Descriptions of files can be entered in the .htaccess file at this location in the format shown below, but Apache needs to be told to use fancy indexing (IndexOptions FancyIndexing) before it will display them.
For better and easier directory listings we recommend using a separate index.html file as created and processed by these scripts.)
#########################################
* A spreadsheet is included to help create file descriptions in the proper format which can be copied into .htaccess, or used by these scripts to create the index files.
* Since double quotes are required around the file description, any additional double quotes within the description may cause very unexpected results.
****** If NOT using the spreadsheet to create the description file
#### Simplest template for file descriptions #####
(Note the quotes and the two spaces required: one after AddDescription and another before the file name.)
AddDescription "This is the actual description portion, may include spaces and html." filename_cannot_include.spaces
##### Normal files example #####
AddDescription "<b>My favorite book- </b>How to pack a picnic lunch without a basket.<hr>" Picnic.txt
##### Audio or Video files examples #####
# Note the equals sign at the end of a directory name to differentiate from partial file names
AddDescription "<b>2020, IMDB 6.4, Drama, Horror, Sci-Fi <a href='URLofVideoTrailer'>Trailer</a> </b>At the height of ... it becomes clear that ...<hr>" movie.mp4
AddDescription "<b>1931, IMDB 7.8, <a href='https://www.imdb.com/video/vi1168638233?playlistId=tt0021884&ref_=tt_ov_vi'>Trailer</a> </b>Henry Frankenstein is a doctor who is trying to discover a way to make the dead walk. He succeeds and creates a monster that has to deal with living again.<hr>" Frankenstein.1931.720p.BluRay.H264.AAC-RARBG.mp4
AddDescription "<b>2017, TV Series, </b>A series about...<hr>" Name.of.Directory=
AddDescription "<b>2018, IMDB 6.1, </b>Three schoolgirls and their governesses mysteriously disappear on Valentines Day in 1900.<hr>" Picnic.Directory=
AddDescription "<b>2014, ArtistName </b>anything you want to describe the music file <hr>" MusicFileName.mp3
#Notice that index.html will be sorted on the file names and index2.html will be sorted on descriptions
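The format rules above (one pair of double quotes around the description, a file name without spaces) can be checked mechanically. Here is a minimal sketch of a parser for one description line. The function name parse_description and its tab-separated output are assumptions, not part of these scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split one AddDescription line into its parts.
# Prints "filename<TAB>description" on success. Returns 1 when the line
# does not match the template, for example when the description contains
# an extra double quote or the file name contains a space.
parse_description() {
    local line=$1
    local re='^AddDescription[[:space:]]+"([^"]*)"[[:space:]]+([^[:space:]]+)[[:space:]]*$'

    if [[ $line =~ $re ]]; then
        printf '%s\t%s\n' "${BASH_REMATCH[2]}" "${BASH_REMATCH[1]}"
    else
        echo "unparseable line: $line" >&2
        return 1
    fi
}

# Example:
#   parse_description 'AddDescription "<b>2017, TV Series, </b>A series about...<hr>" Name.of.Directory='
```

Single quotes inside the description, as used in the href attributes above, pass through untouched; only a stray double quote makes the line fail.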
#######################################
Once your system is configured and working to your satisfaction you may automate the update of owner and permissions (UpdateOwnerAndPermissionsRecursively.bash) and the generation of index files (Run-Htaccess2Index-Recursively.bash).
UpdateOwnerAndPermissionsRecursively.bash must be run as a root cron job, while Run-Htaccess2Index-Recursively.bash must be run as a user cron job.
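As a starting point for the cron setup, the sketch below prints one crontab entry per script. Everything specific in it is an assumption: the helper name cron_entry, the install path /opt/htaccess2index, and the run times are placeholders. The entry changes into the script directory first on the assumption that the scripts expect to find each other there.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a crontab entry that runs SCRIPT from its
# own directory on the given five-field schedule.
# Paths must not contain spaces or percent signs.
cron_entry() {
    local schedule=$1 script=$2
    printf '%s cd %s && bash %s\n' "$schedule" "${script%/*}" "${script##*/}"
}

# The install path and times below are placeholders.
# Paste the first line into root's crontab (sudo crontab -e):
cron_entry '0 3 * * *' /opt/htaccess2index/UpdateOwnerAndPermissionsRecursively.bash
# Paste the second line into your own crontab (crontab -e):
cron_entry '30 3 * * *' /opt/htaccess2index/Run-Htaccess2Index-Recursively.bash
```

Scheduling the permissions update before the index generation is a choice made here for illustration; adjust the order and times to suit your system.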