
The Larger-scale link Checker script for Google Ads manager accounts extends the single account Link Checker script to run for multiple accounts under a manager account.
As a website evolves, new pages get added, old pages are taken down, and links get broken and fixed. Keeping a Google Ads campaign in sync with the website is an ongoing battle for many advertisers. Live advertisements may be pointing to non-existent pages, and the advertiser ends up paying for clicks that yield 404 errors. The first step in managing "zombie" links is to identify them.
The link checker solution is a Google Ads script for validating the existence of landing pages. However, for large accounts the solution has the following issues:
- Large numbers of URLs can exceed the daily quotas for fetching URLs.
- Only one URL can be checked at a time, and the script has an execution time limit. This limits the rate at which URLs can be checked.
The Larger-scale link checker addresses these issues by using Google Cloud infrastructure to assist with the link checking. The task is divided as follows:
- The Google Ads script collates the list of URLs, passes them to App Engine, running on Google Cloud infrastructure, then periodically checks on progress by means of script scheduling.
- The App Engine instance performs the link checking, and is able to check links in parallel, as well as having considerably higher quotas tha Google Ads scripts for performing URL fetches.
Tip: Free App Engine instances are able to fetch a much greater number of URLs per day tha Google Ads Scripts, but are still subject to a quota. However, this quota can be increased much further still by enabling billing.
Configuration options
The script's main options can be set in the spreadsheet.

-
Scope: Select whether the script will check ads, keywords, and/or
sitelinks, and whether the script will check them even if they are paused. Most users
should include all three to ensure all of their URLs are checked, but it is typically
not necessary to check paused entities.
Tip: When working with large accounts, set this to a minimal settings, e.g. just sitelinks. Once you have confirmed your setup is working you can then adjust to include keywords and ads as necessary.
Failure strings: An optional list of strings which should also be considered as a "failure". For example, a page may exist yet contain the text "Out of stock". You may wish to highlight landing pages that point to products that are out of stock, for example.
Note: When failure strings are specified, the link checker must request the full text of a page (GET request), as opposed to merely checking its existence (HEAD request). This will use up more of the App Engine quota, as well as taking longer.
- Email at start of analysis: If you enable this option, the script will email you when each new analysis starts.
- Email after finishing entire analysis: If you enable this option, the script will email you a consolidated summary after it finishes checking every URL.
- Email even if no errors are found: If you enable this option, the script will email you (based on the two options above) even if it did not find any errors. Most users prefer to be emailed only when there are errors, but receiving an email even if there are no errors can be useful way to ensure the script is running as scheduled.
- Days between analyses: Use this option to control how often the script starts a fresh analysis of all of your URLs.
Scheduling
Each time the script runs, it automatically detects whether it should resume an analysis already in progress or whether the last analysis finished and it is time to start a fresh one (based on the Days between analyses option). As a result, regardless of how often you want to launch a fresh analysis, schedule the script to run Hourly.
How it works
The script uses Google Ads Reports to build up a list of URLs to check. As Manager Account scripts can only process 50 accounts in parallel, the script gathers URLs across several executions until all accounts have been inspected. These are then passed to the App Engine application, which performs the hard work of checking each URL.
The script polls the App Engine application each time it runs to determine whether there are any results to collect. Once all URLs have been checked by the App Engine application, the script writes the results back to the Results tab.
Frequently asked questions
- Can the script write successful URLs to the results too?
- No, as this solution is designed to work with a large number of URLs, if a row is created for each URL, regardless of success or failure, this may hit the limit for Spreadsheet rows. Working on the assumption that the majority of landing pages will not return an error, the solution only creates rows for those that result in failure. However, a count of the total URLs checked is included.
- Will the script pause ad groups and campaigns with 404 links?
- No, the script will only identify those with broken links.
Configuration
Configuration requires two steps:
- Setting up the Google Cloud project for the App Engine instance.
- Configuring the link checker options.
- Copy the template Sheets document.
- Don't forget to update
CONFIG_SPREADSHEET_URL
in code. - Schedule the script Hourly.
Cloud setup
Setting up the Cloud aspects of the solution takes a few minutes, but only needs to be done once:
- Select the Cloud setup tab.
- Follow the steps in each box, in turn. It make take a minute or two for each step to take effect, so if an error is returned from one, wait a minute or so, before trying again.
Cloud performance settings
The larger-scale link checker is able to check URLs both:
- In parallel, using multiple tasks at the same time
- At a configurable rate, e.g. the number of URLs per minute checked by each task
As a consequence, the checker may then be blocked temporarily from checking further pages, and the results may be inaccurate. To configure parallelization and checking rate, use the App Engine Performance tab in your configuration spreadsheet:
- Set Max number of concurrent URL checking tasks to the degree of parallelization desired.
- Set Max number of URLs to check per minute, in each task to the maximum number of URLs each parallel task can check per minute.
- Set User Agent to a custom user agent, if you wish to use this in identifying traffic on your server that originates from the link checker.
- Click update to apply these new values.
App Engine project
If you wish to extend the capabilities of this project, the App Engine code can be found on GitHub. This is not required for normal use of the solution, but may be useful for advanced users wishing to take the solution further.
Source code
// Copyright 2017, Google Inc. All Rights Reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. /** * @name App Engine-based link checker - Manager Account version * * @overview The App Engine-based link checker delegates link checking to an * App Engine application in the Cloud. This enables links to be checked at * a greater rate and volume than the purely Google Ads scripts-based * solution. * * @author Google Ads Scripts Team [adwords-scripts@googlegroups.com] * * @version 1.0.6 * * @changelog * - version 1.0.6 * - Change link to new spreadsheet template. * - version 1.0.5 * - Simplify use of shared key and remove reliance on OAuth. * - version 1.0.4 * - Add checks to ensure more robust creation and handling of Drive files. * - version 1.0.3 * - Fix bug in keyword Final URL gathering. * - version 1.0.2 * - Add more robust checking for sitelink extraction from reports. * - version 1.0.1 * - Fix handling of accounts with no serving campaigns. * - version 1.0 * - Released initial version. */ // This configuration spreadsheet should be a copy of : // https://docs.google.com/spreadsheets/d/1sX0DIY7NHbDvAYXN9TuBNL5Vhq3F6uMJDmkgZwSY6RI/copy var CONFIG_SPREADSHEET_URL = 'ENTER_SPREADSHEET_URL'; // The values below here are not intended to be user-changeable. var FIFTEEN_SECS_MILLIS = 15000; var MILLIS_PER_DAY = 24 * 60 * 60 * 1000; var URL_BATCH_SIZE = 15000; var LINK_CHECKER_DRIVE_FOLDER = 'App Engine LinkChecker'; var SITELINK_FINAL_URL_FIELD_ID = 5; var SITELINK_FINAL_MOBILE_URL_FIELD_ID = 6; var ROWS_CHUNK_SIZE = 4000; var URL_PREFIX_RE = /^https?:\/\//; var CONFIG_NAMED_RANGES = [ 'checkAdUrls', 'checkKeywordUrls', 'checkSitelinkUrls', 'checkPausedAds', 'checkPausedKeywords', 'checkPausedSitelinks', 'failureStrings', 'emailEachRun', 'emailOnCompletion', 'emailNonErrors', 'emailRecipients', 'frequency', 'dateStarted', 'dateCompleted', 'dateEmailed', 'ProjectId', 'SharedKey', 'ReadyStatusFlag' ]; var sharedKeyUrlFetch; var baseUrl; var config; var processedAccountFilename; /** * Main entry point. */ function main() { loadConfig(); if (!config.ReadyStatusFlag) { throw Error( 'The necessary Cloud setup steps have not been completed. ' + 'Please open the "Cloud Setup" tab of your configuration Sheets ' + 'document, and complete all the steps.'); } initializeSharedKeyFetch(); if (isOngoing() || isDateForUrlCheck()) { processAccounts(); } } /** * Gathers a list of remaining accounts to be processed and executes processing * on the next 50, if there are any remaining. If none are left to be processed * the function produces the final report. */ function processAccounts() { var processedAccountList = getProcessedAccountsList(); var accounts = getUnprocessedAccountsList(processedAccountList); if (accounts.length) { // Add a further up to 50 accounts to be processed AdsManagerApp.accounts() .withIds(accounts.slice(0, 50)) .executeInParallel('processAccount', 'processAccountCallback'); } else { var operations = listOperations(); var complete = isComplete(operations); if (complete && operations.length) { var numErrors = reportResults(operations); deleteOperations(operations); deleteProcessedAccountsList(); setDateAsNow('dateCompleted'); sendCompletedEmail(numErrors); } } } /** * Produces a mapping of URLs to entities for a specific sub account and saves * the mapping to a temporary file on Drive. * * @return {string} The name of the temporary file on Drive. */ function processAccount() { loadConfig(); var urlMap = loadUrlMap(); return writeTemporaryUrlMap(urlMap); } /** * Creates App Engine jobs for each set of URLs returned from the sub accounts. * * @param {!Array.<AdsManagerApp.ExecutionResult>} results */ function processAccountCallback(results) { loadConfig(); initializeSharedKeyFetch(); var processedAccounts = getProcessedAccountsList(); var accounts = []; for (var i = 0; i < results.length; i++) { var result = results[i]; var urlMap = extractUrlMap(result); var success = createOperations(urlMap); if (success) { accounts.push(result.getCustomerId()); } } if (!processedAccounts.length) { setDateAsNow('dateStarted'); sendStartEmail(); } saveProcessedAccountsList(processedAccounts.concat(accounts)); } /** * Loads a URL mapping from a temporary file for a given ExecutionResult as * returned from the parallel processing. * * @param {ExecutionResult} result The individual ExecutionResult object. * @return {!Object} The URL map. */ function extractUrlMap(result) { var folder = getLinkCheckerFolder(); var urlMap = {}; var file = DriveApp.getFileById(result.getReturnValue()); var contents = Utilities.unzip(file); var data = JSON.parse(contents[0].getDataAsString()); var customerId = result.getCustomerId(); var urls = Object.keys(data); for (var j = 0, url; url = urls[j]; j++) { if (!urlMap[url]) { urlMap[url] = {}; } urlMap[url][customerId] = data[url][customerId]; } folder.removeFile(file); return urlMap; } /** * Writes a URL map to Drive, compressing it. * * @param {!Object} urlMap The map to write. * @return {string} The temporary filename on Drive. */ function writeTemporaryUrlMap(urlMap) { var folder = getLinkCheckerFolder(); var compressed = Utilities.zip([Utilities.newBlob(JSON.stringify(urlMap))]); compressed.setName(Utilities.getUuid()); return folder.createFile(compressed).getId(); } /** * Obtains a list of unprocessed accounts within the MCC, given a list of * already processed accounts. * * @param {!Array.<string>} processed A list of processed account IDs to exclude * @return {!Array.<string>} The list of remaining accounts. */ function getUnprocessedAccountsList(processed) { var output = []; var accounts = AdsManagerApp.accounts().get(); while (accounts.hasNext()) { var customerId = accounts.next().getCustomerId(); if (processed.indexOf(customerId) === -1) { output.push(customerId); } } return output; } /** * Loads configuration values from the configuration spreadsheet based on the * list of Named Ranges. Configuration is assigned to a global variable. */ function loadConfig() { processedAccountFilename = 'processed-accounts-' + AdsApp.currentAccount().getCustomerId(); config = {}; var spreadsheet = SpreadsheetApp.openByUrl(CONFIG_SPREADSHEET_URL); for (var i = 0, namedRange; namedRange = CONFIG_NAMED_RANGES[i]; i++) { var range = spreadsheet.getRangeByName(namedRange); if (!range) { throw Error( 'Named range in configuration spreadsheet not found. The ' + 'spreadsheet must be created using "File > Make a copy". Simply ' + 'copying and pasting cells into a new document is insufficient to ' + 'copy across the required named ranges.'); } if (range.getNumRows() === 1 && range.getNumColumns() === 1) { config[namedRange] = range.getValue(); if (config[namedRange] === 'Yes') { config[namedRange] = true; } else if (config[namedRange] === 'No') { config[namedRange] = false; } } else { var values = range.getValues(); config[namedRange] = values.reduce(function(a, b) { var c = b.join(''); return c.length ? a.concat(c) : a; }, []); } } } /** * Initializes the SharedKeyUrlFetch object for facilitating requests to GAE. */ function initializeSharedKeyFetch() { sharedKeyUrlFetch = SharedKeyUrlFetchApp.withProjectId(config.ProjectId); var accountId = AdsApp.currentAccount().getCustomerId(); baseUrl = 'https://' + config.ProjectId + '.appspot.com/_ah/api/batchLinkChecker/v1/account/' + accountId + '/batchoperation'; } /** * Loads a map of URLs to campaign/ad group/entities based on the preferences * expressed in the configuration. The aim is to create a mapping such that if * a URL is used many times across multiple ads or keywords, it need only be * uploaded to App Engine once, and results can be reconstituted once retrieved * from App Engine. * * @return {!Object} The created map. */ function loadUrlMap() { var urlMap = {}; var servingCampaigns = getServingCampaignList(); if (config.checkAdUrls && (config.checkPausedAds || servingCampaigns.length)) { addAdsToUrlMap(urlMap, servingCampaigns); } if (config.checkKeywordUrls && (config.checkPausedKeywords || servingCampaigns.length)) { addKeywordsToUrlMap(urlMap, servingCampaigns); } if (config.checkSitelinkUrls && (config.checkPausedSitelinks || servingCampaigns.length)) { addSitelinksToUrlMap(urlMap, servingCampaigns); } return urlMap; } /** * Adds a URL to the map for a given combination of entity attributes. The aim * is that a record can be kept for each URL of the campaign, ad group, ad, * keyword etc. This allows only the URLs to be sent off to the App Engine * instance for checking, and then when returned, if multiple entities used * that URL, multiple records produced from the result. * * @param {!Object} urlMap The map to add to. * @param {string} customerId The customer ID. * @param {string} campaignName The name of the containing campaign. * @param {string} adGroupName The name of the containing ad group. * @param {string} type The type of entity: Ad, Keyword or Sitelink. * @param {string|number} id The ID (For ad, sitelink) or text (for keyword). * @param {string} url The URL. */ function addUrlToMap( urlMap, customerId, campaignName, adGroupName, type, id, url) { if (url && URL_PREFIX_RE.test(url)) { var urls = expandUrlModifiers(url); urls.forEach(function(expandedUrl) { if (!urlMap[expandedUrl]) { urlMap[expandedUrl] = {}; } var values = urlMap[expandedUrl]; if (!values[customerId]) { values[customerId] = {}; } var customer = values[customerId]; if (!customer[campaignName]) { customer[campaignName] = {}; } var campaign = customer[campaignName]; if (!campaign[adGroupName]) { campaign[adGroupName] = {}; } var adGroup = campaign[adGroupName]; if (!adGroup[type]) { adGroup[type] = {}; } adGroup[type][id] = true; }); } } /** * Retrieve a list of Campaigns that are currently enabled and serving. * * @return {!Object} Map of Ids */ function getServingCampaignList() { var campaigns = []; var report = AdsApp.report( 'SELECT CampaignId FROM ' + 'CAMPAIGN_PERFORMANCE_REPORT WHERE CampaignStatus = "ENABLED" AND ' + 'ServingStatus="SERVING" DURING LAST_14_DAYS'); var rows = report.rows(); while (rows.hasNext()) { campaigns.push(rows.next().CampaignId); } return campaigns; } /** * Adds URLs to a map from ads in the account. * * @param {!Object} urlMap The map to add to. * @param {!Array.<string>} servingCampaigns A list of serving campaigns. */ function addAdsToUrlMap(urlMap, servingCampaigns) { var customerId = AdsApp.currentAccount().getCustomerId(); var query = 'SELECT CampaignName, AdGroupName, Id, CreativeFinalUrls,' + 'CreativeFinalMobileUrls, CreativeDestinationUrl ' + 'FROM AD_PERFORMANCE_REPORT'; if (!config.checkPausedAds) { query += ' WHERE CampaignStatus = "ENABLED" AND AdGroupStatus = "ENABLED"' + ' AND Status = "ENABLED" AND CampaignId IN [' + servingCampaigns.join(',') + ']'; } else { query += ' WHERE CampaignStatus != "REMOVED" AND ' + 'AdGroupStatus != "REMOVED" AND Status != "DISABLED"'; } query += ' DURING LAST_14_DAYS'; var report = AdsApp.report(query); var rows = report.rows(); while (rows.hasNext()) { var row = rows.next(); if (row.CreativeFinalUrls !== '--') { var urls = JSON.parse(row.CreativeFinalUrls); urls.forEach(function(url) { addUrlToMap( urlMap, customerId, row.CampaignName, row.AdGroupName, 'Ad', row.Id, url); }); } if (row.CreativeFinalMobileUrls !== '--') { var urls = JSON.parse(row.CreativeFinalMobileUrls); urls.forEach(function(url) { addUrlToMap( urlMap, customerId, row.CampaignName, row.AdGroupName, 'Ad', row.Id, url); }); } if (row.CreativeDestinationUrl !== '--') { addUrlToMap( urlMap, customerId, row.CampaignName, row.AdGroupName, 'Ad', row.Id, row.CreativeDestinationUrl); } } } /** * Adds URLs to a map from keywords in the account. * * @param {!Object} urlMap The map to add to. * @param {!Array.<string>} servingCampaigns A list of serving campaigns. */ function addKeywordsToUrlMap(urlMap, servingCampaigns) { var customerId = AdsApp.currentAccount().getCustomerId(); var query = 'SELECT CampaignName, AdGroupName, AdGroupStatus, Criteria, ' + 'FinalUrls, FinalMobileUrls FROM KEYWORDS_PERFORMANCE_REPORT WHERE ' + 'ApprovalStatus = "APPROVED"'; if (!config.checkPausedKeywords) { query += ' AND CampaignStatus = "ENABLED" AND AdGroupStatus = "ENABLED"' + ' AND Status = "ENABLED" AND CampaignId IN [' + servingCampaigns.join(',') + ']'; } else { query += ' AND CampaignStatus != "REMOVED" AND ' + 'AdGroupStatus != "REMOVED" AND Status != "REMOVED"'; } query += ' DURING LAST_14_DAYS'; var report = AdsApp.report(query); var rows = report.rows(); while (rows.hasNext()) { var row = rows.next(); if (!config.checkPausedKeywords && row.AdGroupStatus !== 'enabled') { continue; } if (row.FinalUrls !== '--') { var urls = JSON.parse(row.FinalUrls); urls.forEach(function(url) { addUrlToMap( urlMap, customerId, row.CampaignName, row.AdGroupName, 'Keyword', row.Criteria, url); }); } if (row.FinalMobileUrls !== '--') { var urls = JSON.parse(row.FinalMobileUrls); urls.forEach(function(url) { addUrlToMap( urlMap, customerId, row.CampaignName, row.AdGroupName, 'Keyword', row.Criteria, url); }); } } } /** * Adds URLs to a map from sitelinks in the account. * * @param {!Object} urlMap The map to add to. * @param {!Array.<string>} servingCampaigns A list of serving campaigns. */ function addSitelinksToUrlMap(urlMap, servingCampaigns) { var customerId = AdsApp.currentAccount().getCustomerId(); var query = 'SELECT CampaignName, AdGroupName, AttributeValues, FeedItemId ' + 'FROM PLACEHOLDER_FEED_ITEM_REPORT WHERE PlaceholderType = 1 AND ' + 'DisapprovalShortNames=""'; if (!config.checkPausedSitelinks) { query += ' AND CampaignStatus = "ENABLED" AND AdGroupStatus = "ENABLED"' + ' AND Status = "ENABLED" AND CampaignId IN [' + servingCampaigns.join(',') + ']'; } else { query += ' AND CampaignStatus != "REMOVED" AND ' + 'AdGroupStatus != "REMOVED" AND Status != "REMOVED"'; } query += ' DURING LAST_14_DAYS'; var report = AdsApp.report(query); var rows = report.rows(); while (rows.hasNext()) { var row = rows.next(); var attributeValues = JSON.parse(row.AttributeValues); if (attributeValues[SITELINK_FINAL_URL_FIELD_ID]) { var urls = attributeValues[SITELINK_FINAL_URL_FIELD_ID]; if (Array.isArray(urls)) { urls.forEach(function(url) { addUrlToMap( urlMap, customerId, row.CampaignName, row.AdGroupName, 'Sitelink', row.FeedItemId, url); }); } } if (attributeValues[SITELINK_FINAL_MOBILE_URL_FIELD_ID]) { var urls = attributeValues[SITELINK_FINAL_MOBILE_URL_FIELD_ID]; if (Array.isArray(urls)) { urls.forEach(function(url) { addUrlToMap( urlMap, customerId, row.CampaignName, row.AdGroupName, 'Sitelink', row.FeedItemId, url); }); } } } } /** * Expands a URL that contains ValueTrack parameters such as {ifmobile:mobile} * to all the combinations, and returns as an array. * * @param {string} url The URL which may contain ValueTrack parameters. * @return {!Array.<string>} An array of one or more expanded URLs. */ function expandUrlModifiers(url) { var ifRegex = /({(if\w+):([^}]+)})/gi; var modifiers = {}; var matches; while (matches = ifRegex.exec(url)) { // Tags are case-insensitive, e.g. IfMobile is valid. modifiers[matches[2].toLowerCase()] = { substitute: matches[0], replacement: matches[3] }; } if (Object.keys(modifiers).length) { if (modifiers.ifmobile || modifiers.ifnotmobile) { var mobileCombinations = pairedUrlModifierReplace(modifiers, 'ifmobile', 'ifnotmobile', url); } else { var mobileCombinations = [url]; } // Store in a map on the offchance that there are duplicates. var combinations = {}; mobileCombinations.forEach(function(url) { if (modifiers.ifsearch || modifiers.ifcontent) { pairedUrlModifierReplace(modifiers, 'ifsearch', 'ifcontent', url) .forEach(function(modifiedUrl) { combinations[modifiedUrl] = true; }); } else { combinations[url] = true; } }); var modifiedUrls = Object.keys(combinations); } else { var modifiedUrls = [url]; } // Remove any custom parameters return modifiedUrls.map(function(url) { return url.replace(/{[0-9a-zA-Z\_\+\:]+}/g, ''); }); } /** * Takes a pair of URL modifiers (e.g. ifmobile, ifnotmobile) and returns two * URLs: One considering the first case, the other considering the second. * * @param {Object} modifiers The modifier objects found in the URL. * @param {string} modifier1 * @param {string} modifier2 * @param {string} url The templated URL. * @return {!Array.<string>} The pair of formed URLs. */ function pairedUrlModifierReplace(modifiers, modifier1, modifier2, url) { return [ urlModifierReplace(modifiers, modifier1, modifier2, url), urlModifierReplace(modifiers, modifier2, modifier1, url) ]; } /** * Creates a URL by substituting in the required value for one modifier, and * blanking out the other. * * @param {Object} mods The modifier objects found in the URL. * @param {string} mod1 The modifier to substitute in value for. * @param {string} mod2 The modifier to blank. * @param {string} url The templated URL. * @return {string} The formed URL. */ function urlModifierReplace(mods, mod1, mod2, url) { var modUrl = mods[mod1] ? url.replace(mods[mod1].substitute, mods[mod1].replacement) : url; return mods[mod2] ? modUrl.replace(mods[mod2].substitute, '') : modUrl; } /** * Determines whether a list of operations retrieved from the App Engine * instance is complete or whether some operations are still ongoing. * * @param {!Array.<!Object>} operations A list of operations. * @return {boolean} true if there are no processing operations. */ function isComplete(operations) { return !operations.some(function(op) { return op.status === 'PROCESSING'; }); } /** * Determines whether sufficient time has passed since the last check such that * it is time to launch a new URL check. * * @return {boolean} */ function isDateForUrlCheck() { var now = new Date(); if (!config.dateStarted && !config.dateCompleted) { return true; } if (config.dateCompleted && config.dateCompleted.getTime() >= config.dateStarted.getTime() && now.getTime() - config.dateCompleted.getTime() > MILLIS_PER_DAY * config.frequency) { return true; } return false; } /** * Sets the value of a named range in the configuration to the current Date. * * @param {string} namedRange The named range in the config spreadsheet. */ function setDateAsNow(namedRange) { var spreadsheet = SpreadsheetApp.openByUrl(CONFIG_SPREADSHEET_URL); spreadsheet.getRangeByName(namedRange).setValue(new Date()); } /** * Produces a report from a list of operations as retrieved from the App Engine * instance. * * @param {!Array.<!Object>} operations The list of operations. * @return {number} The total number of errors listed in the report. */ function reportResults(operations) { var errors = []; var totalCheckedResults = 0; var totalCheckedErrors = 0; for (var i = 0, op; op = operations[i]; i++) { var detail = getOperation(op.batchId); var urlMap = loadLookup(op.batchId); if (urlMap) { totalCheckedResults += detail.checkedUrlCount; if (detail.errors) { totalCheckedErrors += detail.errors.length; detail.errors.forEach(function(error) { var errorTable = errorUrlMapToArray(error, urlMap); Array.prototype.push.apply(errors, errorTable); }); } } } writeResultsSpreadsheet(totalCheckedResults, totalCheckedErrors, errors); return totalCheckedErrors; } /** * Creates a table for the spreadsheet report from a single error result for a * URL, and the map of campaigns/ad groups/entities that have use that URL. * * @param {!Object} error The error returned from App Engine, with url and * message properties. * @param {!Object} urlMap A map from URLs to entities structure. * @return {!Array.<!Array.<string>>} The table of results. */ function errorUrlMapToArray(error, urlMap) { var results = []; var mapEntry = urlMap[error.url]; var customerIds = Object.keys(mapEntry); for (var h = 0, customerId; customerId = customerIds[h]; h++) { var customerMap = mapEntry[customerId]; var campaigns = Object.keys(customerMap); for (var i = 0, campaign; campaign = campaigns[i]; i++) { var campaignMap = customerMap[campaign]; var adGroups = Object.keys(campaignMap); for (var j = 0, adGroup; adGroup = adGroups[j]; j++) { var adGroupMap = campaignMap[adGroup]; var types = Object.keys(adGroupMap); for (var k = 0, type; type = types[k]; k++) { var ids = Object.keys(adGroupMap[type]); for (var m = 0, id; id = ids[m]; m++) { results.push([ customerId, campaign, adGroup, type, id, error.url, error.message ]); } } } } } return results; } /** * Writes the results from the URL checking to the Results sheet. * * @param {number} totalCheckedResults The number of URLs checked. * @param {number} totalCheckedErrors The number of URLs that errored. * @param {!Array.<!Array.<string>>} errors The table of errors details. */ function writeResultsSpreadsheet( totalCheckedResults, totalCheckedErrors, errors) { var spreadsheet = SpreadsheetApp.openByUrl(CONFIG_SPREADSHEET_URL); spreadsheet.getRangeByName('numErrors').setValue(totalCheckedErrors); spreadsheet.getRangeByName('numChecked').setValue(totalCheckedResults); var topLeft = spreadsheet.getRangeByName('resultsTopLeft'); var sheet = spreadsheet.getSheetByName('Results'); var dataRange = sheet.getDataRange(); var firstRow = topLeft.getRow(); var firstCol = topLeft.getColumn(); var firstDataRow = Math.max(firstRow, dataRange.getNumRows()); sheet .getRange( firstRow, firstCol, firstDataRow - firstRow + 1, dataRange.getNumColumns() - firstCol + 1) .clear(); for (var s = 0; s < errors.length; s += ROWS_CHUNK_SIZE) { var chunk = errors.slice(s, s + ROWS_CHUNK_SIZE); sheet.getRange(firstRow + s, firstCol, chunk.length, chunk[0].length) .setValues(chunk); } } /** * Retrieve a list of operations from the App Engine instance. * * @return {!Array.<!Object>} A list of operations. */ function listOperations() { var response = sharedKeyUrlFetch.fetch(baseUrl); var data = JSON.parse(response.getContentText()); if (data.items) { return data.items; } return []; } /** * Creates operations to check URLs. URLs are supplied as the keys of a map * from URLs to a structure of the associated Google Ads entities. The map is * split up into batches for submission to the App Engine instance, and only * URLs are submitted. The mapping to entities is saved to Drive to allow the * results to be reconstituted for each entity when retrieved from the App * Engine instance. * * @param {!Object} urlMap A map from URLs to associated entities. * @return {boolean} Returns true if successful. */ function createOperations(urlMap) { var urlKeys = Object.keys(urlMap); var urls = []; var urlSubMap = {}; for (var i = 0, j = urlKeys.length; i < j; i++) { if (!(urls.length % URL_BATCH_SIZE)) { if (urls.length) { var success = createOperationAndSaveLookup(urls, urlSubMap); if (!success) { return false; } } urls = []; urlSubMap = {}; } urls.push(urlKeys[i]); urlSubMap[urlKeys[i]] = urlMap[urlKeys[i]]; } if (urls.length) { var success = createOperationAndSaveLookup(urls, urlSubMap); if (!success) { return false; } } return true; } /** * Creates an operation on the App Engine instance as well as saving the local * data required to interpret the results. Checks that both remote and local * operations were successful, and rolls back if it was not possible to create * both aspects. * * @param {!Array.<string>} urls A list of URLs to check. * @param {!Object} urlSubMap The URL mapping. * @return {boolean} Whether the creation and save was successful. */ function createOperationAndSaveLookup(urls, urlSubMap) { var success = false; try { var batchId = createOperation(urls); } catch (e) { Logger.log('Failed to create operation'); } if (batchId) { success = saveLookup(batchId, urlSubMap); // Rollback the creation of this job on App Engine if it was not possible to // create the local lookup. if (!success) { deleteOperation(batchId); } } return success; } /** * Creates an operation on the App Engine instance. * * @param {!Array.<string>} urlList A list of URLs to check. * @return {string} The ID of the newly-created operation. */ function createOperation(urlList) { var body = {urls: urlList}; if (config.failureStrings && config.failureStrings.length) { body.failureMatchTexts = config.failureStrings; } var options = { method: 'POST', contentType: 'application/json', payload: JSON.stringify(body), muteHttpExceptions: true }; var response = sharedKeyUrlFetch.fetch(baseUrl, options); var data = JSON.parse(response.getContentText()); if (data.items) { return data.items[0]; } throw Error('Failed to create new operation'); } /** * Retrieves the details of an operation. * * @param {string} operationId The ID of the operation. * @return {!Object} The results object. */ function getOperation(operationId) { var url = baseUrl + '/' + operationId; var response = sharedKeyUrlFetch.fetch(url); return JSON.parse(response.getContentText()); } /** * Deletes all operations in a list and the associated lookup file on Drive. * * @param {!Array.<!Object>} operations The operations to delete. */ function deleteOperations(operations) { operations.forEach(function(op) { deleteOperation(op.batchId); deleteLookup(op.batchId); }); } /** * Deletes an operation. * * @param {string} operationId The ID of the operation. */ function deleteOperation(operationId) { var url = baseUrl + '/' + operationId; var options = {method: 'DELETE'}; sharedKeyUrlFetch.fetch(url, options); } /** * Tests whether there is an ongoing check. * * @return {boolean} */ function isOngoing() { var folder = getLinkCheckerFolder(); var files = folder.getFilesByName(processedAccountFilename); return files.hasNext(); } /** * Retrieves a list of accounts already processed. * * @return {!Array.<string>} An array of account IDs. */ function getProcessedAccountsList() { var folder = getLinkCheckerFolder(); var files = folder.getFilesByName(processedAccountFilename); if (files.hasNext()) { return JSON.parse(files.next().getBlob().getDataAsString()); } return []; } /** * Saves the list of accounts processed. * * @param {!Array.<string>} list The list of accounts. */ function saveProcessedAccountsList(list) { var folder = getLinkCheckerFolder(); var files = folder.getFilesByName(processedAccountFilename); if (files.hasNext()) { files.next().setContent(JSON.stringify(list)); } else { folder.createFile(processedAccountFilename, JSON.stringify(list)); } } /** * Removes the list of processed accounts. */ function deleteProcessedAccountsList() { var folder = getLinkCheckerFolder(); var files = folder.getFilesByName(processedAccountFilename); if (files.hasNext()) { folder.removeFile(files.next()); } } /** * Retrieves the folder used for temporary files, creating it first if needed. * * @return {Folder} */ function getLinkCheckerFolder() { var folders = DriveApp.getFoldersByName(LINK_CHECKER_DRIVE_FOLDER); if (!folders.hasNext()) { return DriveApp.createFolder(LINK_CHECKER_DRIVE_FOLDER); } return folders.next(); } /** * Retrieves a URL mapping from drive for a given operation batch ID. * * @param {string} operationId The ID of the operation. * @return {!Object} The URL mapping. */ function loadLookup(operationId) { var folder = getLinkCheckerFolder(); var files = folder.getFilesByName(operationId); if (files.hasNext()) { var file = files.next(); var contents = Utilities.unzip(file); return JSON.parse(contents[0].getDataAsString()); } // As the associated lookup for this operation could not be found on Drive, // the operation should be deleted from the App Engine instance, as the // operation could never be fully processed now. deleteOperation(operationId); Logger.log( 'Could not find any link checker files on Drive in the folder ' + 'named ' + folder.getName() + '. Please check that this folder exists. ' + 'If it cannot be found, start a new analysis by clearing the values in ' + '"Started analysis" and "Finished analysis" in the configuration ' + 'spreadsheet.'); } /** * Saves a URL mapping to Drive, for a specified operation ID. * * @param {string} operationId The ID of the operation. * @param {!Object} urlSubMap The URL mapping. * @return {boolean} Whether the save operation was a success or not. */ function saveLookup(operationId, urlSubMap) { var folder = getLinkCheckerFolder(); try { var contents = JSON.stringify(urlSubMap); var compressed = Utilities.zip([Utilities.newBlob(contents)]); compressed.setName(operationId); var file = folder.createFile(compressed); } catch (e) { Logger.log('An error occurred trying to save the lookup for operation ' + operationId + ' : ' + e); return false; } if (!file) { Logger.log('The lookup file for operation ' + operationId + ' did not save correctly'); return false; } return true; } /** * Deletes a URL mapping from Drive. * * @param {string} operationId The ID of the URL mapping to delete. */ function deleteLookup(operationId) { var folder = getLinkCheckerFolder(); var files = folder.getFilesByName(operationId); var forDeletion = []; while (files.hasNext()) { forDeletion.push(files.next()); } forDeletion.forEach(function(file) { folder.removeFile(file); }); } /** * Sends an email when the link checking has started. */ function sendStartEmail() { var customerId = AdsApp.currentAccount().getCustomerId(); if (config.emailEachRun && config.emailRecipients && config.emailRecipients.length) { MailApp.sendEmail( config.emailRecipients.join(','), customerId + ': Link Checker: Checking started', 'Link checking has started on account: ' + customerId + '. For more details see the spreadsheet: ' + CONFIG_SPREADSHEET_URL); setDateAsNow('dateEmailed'); } } /** * Sends an email when the link checking has completed. * * @param {number} numErrors The number of errors in the report. */ function sendCompletedEmail(numErrors) { var customerId = AdsApp.currentAccount().getCustomerId(); if (config.emailOnCompletion && (config.emailNonErrors || numErrors > 0) && config.emailRecipients && config.emailRecipients.length) { MailApp.sendEmail( config.emailRecipients.join(','), customerId + ': Link Checker: Checking completed', 'Link checking has completed on account: ' + customerId + '. For more details see the spreadsheet: ' + CONFIG_SPREADSHEET_URL); setDateAsNow('dateEmailed'); } } /** * Sends an email when the link checking is ongoing. */ function sendProcessingEmail() { var customerId = AdsApp.currentAccount().getCustomerId(); if (config.emailEachRun && config.emailRecipients && config.emailRecipients.length) { MailApp.sendEmail( config.emailRecipients.join(','), customerId + ': Link Checker: Checking is ongoing', 'Link checking is still running on account: ' + customerId + '. For more details see the spreadsheet: ' + CONFIG_SPREADSHEET_URL); setDateAsNow('dateEmailed'); } } /** * Adds a SharedKeyUrlFetchApp object, for creating requests to the App Engine * instance using a shared key. * * @param {!Object} scope The object to add SharedKeyUrlFetchApp to. */ (function(scope) { /** * Creates an object for making authenticated URL fetch requests with a * given sharedKey * * @constructor */ function SharedKeyUrlFetchApp_() { this.sharedKey_ = config.SharedKey; } /** * Performs an HTTP request for the given URL. * Has retries, as the service may not have initialised or spun up. * * @param {string} url The URL to fetch * @param {?Object=} opt_options Options as per UrlFetchApp.fetch * @return {!HTTPResponse} The HTTP Response object. */ SharedKeyUrlFetchApp_.prototype.fetch = function(url, opt_options) { var attempts = 5; var fetchOptions = opt_options || {}; fetchOptions.muteHttpExceptions = true; fetchOptions.headers = {Authorization: this.sharedKey_}; var response = UrlFetchApp.fetch(url, fetchOptions); while (response.getResponseCode() >= 400 && attempts) { Utilities.sleep(FIFTEEN_SECS_MILLIS); attempts--; response = UrlFetchApp.fetch(url, fetchOptions); } if (response.getResponseCode() >= 400) { throw Error( 'Error talking to Google App Engine UrlFetch service: ' + response.getResponseCode() + ' : ' + response.getContentText()); } return response; }; function withProjectId(projectId) { return new SharedKeyUrlFetchApp_(); } scope.SharedKeyUrlFetchApp = {withProjectId: withProjectId}; })(this);