
Create Multiple URLs Using Each Path of a URL in Python

  • Last updated Apr 25, 2024

Before we start creating multiple URLs using each path of a URL, let's first understand the concept of URL paths. The path of a URL is the part of the web address that follows the domain. For example, in the URL https://example.com/blog/post-1, the path is /blog/post-1.
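As a quick illustration, the standard library's urlparse function can separate these parts. The snippet below is a minimal sketch using the same example URL; the variable name parts is only illustrative.

```python
from urllib.parse import urlparse

# Split the example URL into its components.
parts = urlparse("https://example.com/blog/post-1")

print(parts.scheme)  # https
print(parts.netloc)  # example.com
print(parts.path)    # /blog/post-1
```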

Here's an example that takes an input URL, splits its path, and generates a list of related URLs based on the hierarchical structure of the path:

from urllib.parse import urlparse

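def create_multiple_urls_from_url_path(url):
    results = []
    # Parse the URL once and reuse the result.
    parsed = urlparse(url)
    base_url = f"{parsed.scheme}://{parsed.netloc}"
    path = parsed.path
    # Positions of every "/" in the path; each one marks a directory level.
    dirs_position = [pos for pos, char in enumerate(path) if char == "/"]
    for i in dirs_position:
        # Slice the path up to and including the slash at position i.
        results.append(base_url + path[:i + 1])
    return results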


url_list = create_multiple_urls_from_url_path("https://www.example/home/dashboard/profile/index.php")
print(url_list)

In this example, the code begins by importing the urlparse function from the urllib.parse module. Next, a custom function named create_multiple_urls_from_url_path is defined. It takes an input URL as its parameter and returns a list of related URLs.

Inside the function, an empty list called results is initialized to store the generated URLs. The input URL is then divided into its components using urlparse: the base URL, which combines the scheme and netloc, and the path.

The code then identifies the positions of forward slashes within the path, which mark the different directory levels. This is done by building a list called dirs_position with a list comprehension. For each slash position, the script constructs a new URL by combining the base URL with the portion of the path up to and including that slash, and appends it to the results list. Because every generated URL ends at a slash, the final segment (index.php in the example) is not included in the results. The function finally returns the list to the caller.

In practical terms, the code is useful for web scraping or navigation tasks, since it lets you visit each directory level of a site in turn. The generated list of URLs can be used for data retrieval, site structure analysis, and more.

The output of the above code is as follows:

['https://www.example/', 'https://www.example/home/', 'https://www.example/home/dashboard/', 'https://www.example/home/dashboard/profile/']
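The same idea can also be sketched by splitting the path into segments rather than scanning for slash positions. The variant below is only an illustration, not part of the original example: the function name build_urls_from_segments and the include_full_url parameter are hypothetical. It assumes a path that does not end in a slash finishes with a file name, it skips empty segments (so a doubled slash does not produce a duplicate level), and unlike the original it always returns the root URL, even for an empty path.

```python
from urllib.parse import urlparse


def build_urls_from_segments(url, include_full_url=False):
    """Hypothetical variant: build one URL per directory level from path segments."""
    parsed = urlparse(url)
    base_url = f"{parsed.scheme}://{parsed.netloc}"
    # Drop empty segments so a leading "/" or a "//" adds no extra level.
    segments = [s for s in parsed.path.split("/") if s]
    # Assumption: a path that does not end with "/" finishes with a file name.
    has_file = bool(segments) and not parsed.path.endswith("/")
    dirs = segments[:-1] if has_file else segments

    results = [base_url + "/"]
    current = base_url
    for segment in dirs:
        current = f"{current}/{segment}"
        results.append(current + "/")
    # Optionally include the URL of the file itself as the last entry.
    if include_full_url and has_file:
        results.append(f"{current}/{segments[-1]}")
    return results


print(build_urls_from_segments("https://www.example/home/dashboard/profile/index.php"))
```

For the sample URL this produces the same four directory URLs as the output above; passing include_full_url=True additionally appends the full URL ending in index.php.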