
The testScraper function, exported as a module, scrapes Reddit links using a Selenium client obtained from getClient and returns the scraped results. It accepts an optional startPage parameter that defaults to a multi-subreddit Reddit URL; if the value does not contain a protocol it is treated as a subreddit name and prefixed with https://www.reddit.com/r/.

Run example

npm run import -- "test reddit scraper"

test reddit scraper

const redditLinks = importer.import("reddit scraper")
const getClient = importer.import("selenium client")

async function testScraper(startPage = 'https://www.reddit.com/r/CollapseSupport+climatechange+collapse+economicCollapse/') {
  if(!startPage.includes('://')) {
    startPage = 'https://www.reddit.com/r/' + startPage
  }

  let driver = await getClient()

  let result = await redditLinks(driver, startPage)

  driver.quit()

  return result
}


module.exports = testScraper
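
Assuming the same importer global the notebook environment provides above, another cell could call the exported function like this (the subreddit name and logging are illustrative only); either a bare subreddit name or a full URL works:

const testScraper = importer.import("test reddit scraper")

testScraper('collapse')
  .then(links => console.log(links))
  .catch(err => console.error(err))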

What the code could have been:

const { Import } = require('./importer');
const { Client } = require('./selenium-client');

/**
 * Tests a Reddit scraper by navigating to the specified subreddit and scraping links.
 * 
 * @param {string} startPage - The subreddit to scrape. Defaults to 'CollapseSupport+climatechange+collapse+economicCollapse'.
 * @returns {Promise<object>} The scraped Reddit links.
 */
async function testScraper(startPage = 'CollapseSupport+climatechange+collapse+economicCollapse') {
  const basePage = 'https://www.reddit.com/r/';
  const fullStartPage = startPage.includes('://') ? startPage : basePage + startPage;
  
  // Initialize the Selenium driver.
  const driver = await Client.getInstance();

  try {
    // Scrape the Reddit links.
    const result = await Import.getRedditLinks(driver, fullStartPage);
    
    return result;
  } finally {
    // Quit the driver to free up resources.
    await driver.quit();
  }
}

module.exports = testScraper;
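
Compared with the original cell, the main functional difference is the try/finally block: driver.quit() is awaited and runs even when the scraping call throws, so a failed run does not leave a browser session open.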

Code Breakdown

Importing Dependencies

const redditLinks = importer.import('reddit scraper')
const getClient = importer.import('selenium client')

testScraper Function

async function testScraper(startPage = 'https://www.reddit.com/r/CollapseSupport+climatechange+collapse+economicCollapse/') {
 ...
}

URL Validation and Modification

if(!startPage.includes('://')) {
  startPage = 'https://www.reddit.com/r/' + startPage
}
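
For example, passing 'collapse' produces 'https://www.reddit.com/r/collapse', while a value such as 'https://www.reddit.com/r/collapse/' already contains '://' and is used unchanged.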

Initializing Selenium Client

let driver = await getClient()
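
The selenium client cell itself is not shown in this document. As a rough, hypothetical stand-in using the selenium-webdriver package (the real cell presumably configures browser options, profiles, and timeouts), getClient could look like:

const { Builder } = require('selenium-webdriver')

// Hypothetical sketch only; the actual "selenium client" cell is imported above.
async function getClient() {
  // Start a plain Chrome session and hand the driver back to the caller.
  return await new Builder().forBrowser('chrome').build()
}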

Scraping Reddit Links

let result = await redditLinks(driver, startPage)
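
The reddit scraper cell is likewise imported rather than defined here. As an illustration only (the selectors, pagination handling, and return shape of the real cell are not shown), the general pattern is to navigate to startPage and collect link attributes from the rendered page:

const { By } = require('selenium-webdriver')

// Simplified sketch of the scraping pattern; not the real "reddit scraper" cell.
async function redditLinksSketch(driver, startPage) {
  await driver.get(startPage)
  const anchors = await driver.findElements(By.css('a'))
  return Promise.all(anchors.map(a => a.getAttribute('href')))
}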

Quitting Selenium Client

driver.quit()

Returning Result

return result

Exporting Function

module.exports = testScraper