
SnapshotAzureBlob

Overview

SnapshotAzureBlob is a small console-based background job that iterates every blob in a specified Azure Blob Storage container and creates a point-in-time snapshot for each blob that does not yet have one. It is designed to run on a schedule (continuously or once) and provides a simple, low-configuration safety net against accidental blob deletion or corruption. The job targets .NET Framework 4.7.2 and integrates with the shared NuGet.Jobs.Common infrastructure used across all NuGet gallery background jobs.
A snapshot is created only when a blob has exactly one entry in its blob listing (that is, the base blob itself with no existing snapshot). This makes the job idempotent: a blob that already has at least one snapshot is skipped, so repeated runs never accumulate additional snapshots for it.
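The "exactly one listing entry" rule can be illustrated without touching Azure at all. The sketch below is a hypothetical in-memory model, not the job's code: `ListingEntry` and `SnapshotGuard.NeedsSnapshot` are names invented for illustration, and the listing is assumed to behave like a plain name-prefix filter that returns the base blob plus one entry per snapshot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of one row in a snapshot-aware blob listing.
// SnapshotTime is null for the base blob and set for each snapshot.
public sealed class ListingEntry
{
    public ListingEntry(string name, DateTimeOffset? snapshotTime = null)
    {
        Name = name;
        SnapshotTime = snapshotTime;
    }

    public string Name { get; }
    public DateTimeOffset? SnapshotTime { get; }
}

public static class SnapshotGuard
{
    // Returns true only when the prefix-filtered listing holds exactly one
    // entry, which is taken to mean "base blob present, no snapshot yet".
    public static bool NeedsSnapshot(string blobName, IEnumerable<ListingEntry> listing)
    {
        return listing.Count(e => e.Name.StartsWith(blobName, StringComparison.Ordinal)) == 1;
    }
}
```

In this model a blob with only its base entry needs a snapshot, and the same blob stops qualifying once a snapshot entry appears. Because the filter is a name prefix, a blob whose name is a prefix of another blob's name is also never counted as alone; whether that matters in practice depends on the naming scheme of the target container.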

Role in System

Within the NuGetGallery ecosystem, blob storage containers hold essential artifacts such as package files, statistics exports, and auxiliary search data. SnapshotAzureBlob acts as a data-durability guardrail: by maintaining at least one snapshot per blob, operations teams can restore a prior version of any blob without needing a full storage account backup.

Data Protection

Prevents permanent data loss from accidental writes or deletions by preserving a recoverable snapshot of every blob.

Scheduled Job

Runs via the shared JobRunner loop — can execute once or continuously with a configurable sleep interval.

Parallel Execution

Uses Parallel.ForEach to snapshot blobs concurrently, minimising total wall-clock time for large containers.

Application Insights

Emits structured telemetry through Microsoft.Extensions.Logging with reserved event IDs in the 800 range.

Key Files / Classes

File | Class / Type | Purpose
Program.cs | Program | Entry point. Instantiates SnapshotAzureBlobJob and hands off to JobRunner.Run().
SnapshotAzureBlobJob.cs | SnapshotAzureBlobJob | Core job logic. Implements JobBase.Init() and JobBase.Run(). Parses args, connects to Azure Storage, and drives the snapshot loop.
ArgumentNames.cs | ArgumentNames | Static constants for the two required CLI arguments: ConnectionString and Container.
LogEvents.cs | LogEvents | Defines structured EventId constants in the reserved 800–802 range for this job.

Dependencies

NuGet Packages

Package | Purpose
WindowsAzure.Storage | Legacy Azure Blob Storage SDK. Provides CloudStorageAccount, CloudBlobClient, CloudBlockBlob, and the synchronous Snapshot() API.
System.ComponentModel.EventBasedAsync | Provides IServiceContainer used by JobBase.Init() for service resolution.

Internal Project References

Project | Purpose
NuGet.Jobs.Common | Shared job infrastructure: JobBase, JobRunner, JobConfigurationManager, Application Insights wiring, and logging setup.

Runtime Arguments

The job accepts the following command-line arguments passed as -ArgName value pairs:
Argument | Required | Description
-ConnectionString | Yes | Azure Storage connection string for the account containing the target container.
-Container | Yes | Name of the blob container whose blobs should be snapshotted.
-InstrumentationKey | Recommended | Application Insights instrumentation key for telemetry.
-Once | No | Run a single iteration instead of looping continuously.
-Sleep | No | Milliseconds to sleep between continuous iterations (default: 5000).
-Interval | No | Seconds between iterations (alternative to -Sleep).
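To make the -ArgName value convention concrete, here is a minimal sketch of how such pairs could be turned into a dictionary and checked for the two required arguments. It is not the parsing used by NuGet.Jobs.Common: `ArgSketch`, `Parse`, and `RequireStorageArgs` are hypothetical names, and two behaviours are assumptions made for the sketch, namely that names are matched case-insensitively and that a name with no following value (such as -Once) is a bare switch.

```csharp
using System;
using System.Collections.Generic;

public static class ArgSketch
{
    // Hypothetical parser for "-Name value" pairs. A name followed by another
    // "-Name", or by nothing, is stored as a bare switch with the value "True".
    public static Dictionary<string, string> Parse(string[] args)
    {
        // Assumption: argument names are compared case-insensitively.
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < args.Length; i++)
        {
            if (!args[i].StartsWith("-", StringComparison.Ordinal))
            {
                throw new ArgumentException("Expected an argument name, got: " + args[i]);
            }

            string name = args[i].Substring(1);
            bool hasValue = i + 1 < args.Length
                && !args[i + 1].StartsWith("-", StringComparison.Ordinal);
            result[name] = hasValue ? args[++i] : bool.TrueString;
        }
        return result;
    }

    // Fails fast when either of the two required arguments is missing.
    public static void RequireStorageArgs(IDictionary<string, string> parsed)
    {
        foreach (var required in new[] { "ConnectionString", "Container" })
        {
            if (!parsed.ContainsKey(required))
            {
                throw new ArgumentException("Missing required argument -" + required);
            }
        }
    }
}
```

With this sketch, the argument list `-ConnectionString UseDevelopmentStorage=true -Container packages -Once` yields three entries, with Once stored as the string "True". A value that itself begins with a hyphen would be misread as a new argument name, which is one reason a real job would lean on the shared infrastructure instead.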

Notable Patterns and Implementation Details

Snapshot guard logic: EnsureOneSnapshot performs a flat container listing with BlobListingDetails.None, then re-queries each blob by name prefix with BlobListingDetails.Snapshots. Only when expandedList.Count() == 1 (base blob present, no snapshots yet) does it call blob.Snapshot(). A container that was already snapshotted will produce zero new snapshots on the next run.
Legacy Storage SDK: The project uses the older WindowsAzure.Storage package rather than the current Azure.Storage.Blobs SDK. The classic CloudBlockBlob.Snapshot() method is synchronous; the Parallel.ForEach loop compensates with thread-level concurrency but cannot benefit from async I/O.
Synchronous Run() body: The Run() override returns Task.FromResult(0) immediately after calling EnsureOneSnapshot, which performs all network I/O synchronously inside Parallel.ForEach. This is acceptable for a standalone console job but would block a thread-pool thread in any async-first host.
Per-blob error isolation: Each blob’s snapshot attempt is wrapped in an individual try/catch inside the Parallel.ForEach lambda. A failure for one blob logs a Critical event (event ID 802) and processing continues for all remaining blobs rather than aborting the run.
// Core snapshot guard — SnapshotAzureBlobJob.EnsureOneSnapshot
Parallel.ForEach(blobList, (item) =>
{
    var blob = item as CloudBlockBlob;
    if (blob != null)
    {
        var expandedList = container.ListBlobs(
            prefix: blob.Name,
            useFlatBlobListing: true,
            blobListingDetails: BlobListingDetails.Snapshots).ToList();

        if (expandedList.Count() == 1)  // base blob only — no snapshot exists yet
        {
            Interlocked.Increment(ref snapshotCount);
            blob.Snapshot();
        }
    }
});
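The excerpt above omits the per-blob try/catch described under "Per-blob error isolation". The following sketch shows the general shape of that pattern under stated assumptions: `IsolationSketch.SnapshotAll` is a hypothetical helper, the snapshot call is replaced by an arbitrary `Action<string>` so the example runs without the storage SDK, and failures are collected in a bag instead of being logged with event ID 802.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class IsolationSketch
{
    // Runs snapshotAction for every name in parallel. An exception from one
    // item is caught and recorded, so the remaining items are still processed.
    // Returns the number of items whose action completed without throwing.
    public static int SnapshotAll(
        IEnumerable<string> blobNames,
        Action<string> snapshotAction,
        ConcurrentBag<string> failedNames)
    {
        int succeeded = 0;
        Parallel.ForEach(blobNames, name =>
        {
            try
            {
                snapshotAction(name);
                // Counted only after the action returns, so failures are excluded.
                Interlocked.Increment(ref succeeded);
            }
            catch (Exception)
            {
                // The job described here logs a Critical event at this point;
                // this sketch only remembers which item failed.
                failedNames.Add(name);
            }
        });
        return succeeded;
    }
}
```

Catching inside the lambda matters because an exception that escapes a Parallel.ForEach body is rethrown as an AggregateException and stops the loop from starting further iterations. Note one deliberate difference from the excerpt: the sketch increments its counter after the action succeeds, whereas the excerpt increments snapshotCount before calling blob.Snapshot().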

Log Events

Event ID | Name | Severity | When Emitted
800 | JobRunFailed | (not listed) | Reserved; surfaced by JobRunner on unhandled run failure.
801 | JobInitFailed | (not listed) | Reserved; surfaced by JobRunner on initialisation failure.
802 | SnaphotFailed | Critical | Emitted per-blob when blob.Snapshot() throws an exception.
Event ID 802 is named SnaphotFailed in the source: the second "s" of "Snapshot" is missing. The typo has been preserved as-is to avoid breaking any existing Application Insights queries or alerts that filter on the event name string.