Search auxiliary files
Subsystem: Search 🔎
Aside from metadata stored in the Azure Search indexes, there is data stored in Azure Blob Storage for bookkeeping and performance reasons. These data files are called auxiliary files.
The data files mentioned here are those explicitly managed by the search subsystem. Other data files exist (manually created, created by the statistics subsystem, etc.). Those are not covered here but are mentioned in the documentation of the specific jobs that use them as input.
Each search auxiliary file is copied to every region in which a search service is deployed. For nuget.org, we run search in four regions, so there are four copies of each of these files.
The search auxiliary files are:
- downloads/downloads.v2.json - total download count for every package version
- owners/owners.v2.json and change history - owners for every package ID
- verified-packages/verified-packages.v1.json - package IDs that are verified
- popularity-transfers/popularity-transfers.v1.json - popularity transfers between package IDs
- ExcludedPackages.v1.json - package IDs excluded from the default search results
Download count data
The downloads/downloads.v2.json file has the total download count for all package versions. The total download count
for a package ID as a whole can be calculated simply by adding all version download counts.
The downloads data file looks like this:
Versions in the file are normalized using the standard NuGetVersion normalization rules
(e.g. no build metadata will appear, no leading zeroes, etc.).
If a package ID or version is missing from the data file, this only indicates that there is no download count data for
it and does not imply that the package ID or version is absent from the package source. Conversely, package IDs or
versions that no longer exist on the package source (perhaps due to deletion) may still appear in the data file.
The order of the IDs and versions in the file is undefined.
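
The per-ID summation described earlier in this section can be sketched in a few lines of Python. This is only an illustration: it assumes the file is a JSON object that maps each package ID to an object of version to download count. That shape, the sample values, and the function name `total_downloads` are assumptions, not taken from the real file or from DownloadDataClient.

```python
import json

# Hypothetical sample content. The nested "ID -> version -> count" shape
# is an assumption made for this sketch.
SAMPLE = """
{
  "Example.Package": {"1.0.0": 10, "2.0.0": 20},
  "Other.Package": {"0.1.0-beta": 5}
}
"""


def total_downloads(data, package_id):
    """Sum the download counts of every version of one package ID.

    Returns None when the ID is absent: a missing entry only means there
    is no download count data, not that the package does not exist.
    """
    versions = data.get(package_id)
    if versions is None:
        return None
    return sum(versions.values())


data = json.loads(SAMPLE)
print(total_downloads(data, "Example.Package"))
print(total_downloads(data, "Missing.Package"))
```

Because the order of IDs and versions is undefined, the sketch looks entries up by key and never relies on position.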
This file has a “v2” in the file name because it is the second version of this data. The “v1” format is still produced
by the statistics subsystem and has a less friendly data format.
The class for reading and writing this file to Blob Storage is DownloadDataClient.
Package ownership data
The owners/owners.v2.json file contains the owner information about all package IDs. Each time this file is updated,
the set of package IDs that changed is written to a “change history” file with a path pattern like
owners/changes/TIMESTAMP.json.
The class for reading and writing these files to Blob Storage is OwnerDataClient.
owners/owners.v2.json
The owners data file looks like this:
Change history
The change history files do not contain owner usernames for GDPR reasons but mention all of the package IDs that had
ownership changes since the last time that the owners.v2.json file was generated. If a package ID is not mentioned in
a file, that means that there were no ownership changes in the time window. An ownership change is defined as one or
more owners being added to or removed from the set of owners for that package ID.
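
The definition of an ownership change can be illustrated with a small set comparison. The sketch below assumes two snapshots shaped as dictionaries from package ID to a collection of owner usernames, and it compares usernames and IDs as exact strings. The snapshot shape, the comparison rules, and the name `changed_package_ids` are assumptions for illustration, not the behavior of OwnerDataClient.

```python
def changed_package_ids(old_owners, new_owners):
    """Return the package IDs whose set of owners differs between snapshots.

    An ID counts as changed when at least one owner was added or removed.
    IDs present in only one snapshot are compared against an empty set.
    """
    changed = set()
    for package_id in old_owners.keys() | new_owners.keys():
        before = set(old_owners.get(package_id, ()))
        after = set(new_owners.get(package_id, ()))
        if before != after:
            changed.add(package_id)
    return changed


old = {"A": ["alice"], "B": ["bob"]}
new = {"A": ["alice", "carol"], "B": ["bob"], "C": ["dave"]}
print(sorted(changed_package_ids(old, new)))
```

The result holds only package IDs, which mirrors the point that change history files carry no owner usernames.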
Each change history data file has a file name with timestamp format yyyy-MM-dd-HH-mm-ss-FFFFFFF (UTC) and a file
extension of .json.
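
To order change history files by time, the file name can be parsed back into a timestamp. The following is a rough sketch with several assumptions: `FFFFFFF` is read with the .NET custom format meaning (up to seven fractional-second digits, trailing zeroes omitted), the fraction may therefore be shorter than seven digits or empty, and the helper name `parse_change_file_name` is hypothetical. Python's `datetime` only has microsecond resolution, so the seventh digit is truncated.

```python
from datetime import datetime, timezone


def parse_change_file_name(name):
    """Parse a 'yyyy-MM-dd-HH-mm-ss-FFFFFFF.json' file name as a UTC datetime."""
    suffix = ".json"
    if not name.endswith(suffix):
        raise ValueError(f"not a .json file name: {name!r}")
    parts = name[: -len(suffix)].split("-")
    if len(parts) not in (6, 7):
        raise ValueError(f"unexpected timestamp layout: {name!r}")
    year, month, day, hour, minute, second = (int(p) for p in parts[:6])
    fraction = parts[6] if len(parts) == 7 else ""
    if len(fraction) > 7:
        raise ValueError(f"fraction has more than 7 digits: {name!r}")
    # Right-pad to 7 digits (100 ns units), then drop the last digit to get
    # microseconds, the finest resolution datetime supports.
    ticks = int(fraction.ljust(7, "0")) if fraction else 0
    return datetime(year, month, day, hour, minute, second,
                    ticks // 10, tzinfo=timezone.utc)


names = ["2020-01-02-03-04-05-5.json", "2020-01-02-03-04-05-1234567.json"]
ordered = sorted(names, key=parse_change_file_name)
print(ordered)
```

Sorting by the parsed value rather than by the raw string avoids depending on how variable-length fractions happen to compare as text.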
The files look like this:
Verified packages data
The verified-packages/verified-packages.v1.json data file contains all package IDs that are considered verified by the prefix reservation feature. This essentially defines the verified checkmark icon in the search UIs.
The data file looks like this:
The class for reading and writing this file to Blob Storage is VerifiedPackagesDataClient.
Popularity transfer data
The popularity-transfers/popularity-transfers.v1.json data file has a mapping of all package IDs that have
transferred their popularity to one or more other packages.
The data file looks like this:
The class for reading and writing this file to Blob Storage is PopularityTransferDataClient.
Excluded packages
The ExcludedPackages.v1.json file is not present in the region-specific storage accounts and is only used during the
index rebuild process. It contains a list of package IDs that should be excluded from the default search. Default search
is the search query that has no search text at all (empty search text). It is displayed on NuGet.org when you click the
“Packages” link in the navigation tab and in the Visual Studio Package Manager UI, on the Browse tab.
This file prevents the default search results from being filled with high-download but somewhat uninteresting
packages that ship as part of the .NET BCL. These packages are rarely installed manually, so it’s not useful to show
them in the default search results.
Note that this is not the same as unlisting the package. An unlisted package does not appear in any searches against
the Search index. Package IDs mentioned in the excluded packages list do appear
in any search that has some search text (e.g. searching for that specific excluded package ID).
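
The difference between excluded and unlisted packages can be summarized as a small predicate. This is a simplified sketch, not the search service's logic: it treats both lists as plain sets of package IDs compared as exact strings, it ignores whether the search text actually matches the package, and the function name `may_appear_in_results` is hypothetical.

```python
def may_appear_in_results(package_id, search_text, excluded_ids, unlisted_ids):
    """Say whether a package ID is eligible to appear for a given query.

    Unlisted packages are never eligible. Excluded packages are hidden only
    from the default search, i.e. a query with empty search text.
    """
    if package_id in unlisted_ids:
        return False
    is_default_search = search_text == ""
    if is_default_search and package_id in excluded_ids:
        return False
    return True


excluded = {"Excluded.Example"}
unlisted = {"Unlisted.Example"}
print(may_appear_in_results("Excluded.Example", "", excluded, unlisted))
print(may_appear_in_results("Excluded.Example", "Excluded.Example", excluded, unlisted))
print(may_appear_in_results("Unlisted.Example", "Unlisted.Example", excluded, unlisted))
```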
The data file looks like this: