Overview
NuGet.Services.Build is a small, purpose-built MSBuild extension library that ships both as a compiled assembly and as a NuGet package (package ID NuGet.Services.Build, version 1.0.0). Its primary role is to act as a shared component for the repository-wide build and code-signing infrastructure. It is not a runtime service; it is compiled and consumed entirely within the build pipeline.
The project contains two distinct components. The first is FindDuplicateFiles, a custom MSBuild Task that receives a list of file paths and partitions them into unique files and content-identical duplicates using a three-stage comparison strategy: file size, a SHA-256 hash of the first 1,024 bytes (the “leading hash”), and a full SHA-256 hash of the entire file. This tiered approach avoids reading entire large files unless necessary. The task also supports an optional SkipAuthenticodeSubjects parameter that filters out already-signed binaries (identified by their Authenticode certificate subject) from the duplicate list — this is specifically needed for test-signing scenarios where MicroBuild fails if it encounters a file that is already fully signed.
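The tiered comparison can be illustrated with a short sketch. This is a Python analogue written for this overview, not the C# task itself: the function names, the return shape, and the choice to keep the first file of each content-identical group as the canonical copy are assumptions, and it does not cover missing files, path-identical inputs, or Authenticode filtering.

```python
import hashlib
import os
from collections import defaultdict

LEADING_BYTES = 1024  # size of the "leading hash" window


def _sha256(path, limit=None):
    """Hash the whole file, or only its first `limit` bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        if limit is not None:
            digest.update(handle.read(limit))
        else:
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
    return digest.hexdigest()


def _group_by(paths, key):
    """Group paths by key(path), preserving first-seen order."""
    groups = defaultdict(list)
    for path in paths:
        groups[key(path)].append(path)
    return list(groups.values())


def find_duplicate_files(paths):
    """Return (unique, duplicates); duplicates maps a path to its canonical copy."""
    unique, duplicates = [], {}
    for same_size in _group_by(paths, os.path.getsize):
        if len(same_size) == 1:
            unique.extend(same_size)  # unique size: no content is read
            continue
        for same_head in _group_by(same_size, lambda p: _sha256(p, LEADING_BYTES)):
            if len(same_head) == 1:
                unique.extend(same_head)  # differs within the leading bytes
                continue
            for same_content in _group_by(same_head, _sha256):
                canonical = same_content[0]  # assumption: first file wins
                unique.append(canonical)
                for other in same_content[1:]:
                    duplicates[other] = canonical  # analogous to DuplicateOf
    return unique, duplicates
```

Each stage only runs on the groups that the previous stage could not separate, so a full read happens only for files that already match in size and in their first 1,024 bytes.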
The second component is VisualStudioSetupConfigurationHelper, a static helper class that uses the Microsoft.VisualStudio.Setup.Configuration.Interop COM API to enumerate all installed Visual Studio instance paths on the build machine. This is used by the PowerShell build infrastructure (build/common.ps1) to locate the correct MSBuild executable for a given Visual Studio major version.
The project is intentionally constrained to C# 5 (<LangVersion>5</LangVersion>). This is required because FindDuplicateFiles.cs is also consumed directly as raw source by MSBuild’s CodeTaskFactory (in build/FindDuplicateFiles.targets), which only supports C# 5 syntax. The same source file is therefore used in two different compilation modes simultaneously.
Role in System
FindDuplicateFiles MSBuild Task
Custom Task subclass that partitions a file list into UniqueFiles and DuplicateFiles item groups using size → leading-hash → full-hash comparison. Duplicate items receive a DuplicateOf metadata field pointing to the canonical copy.
Authenticode Subject Filtering
The SkipAuthenticodeSubjects parameter (semicolon-delimited) lets the signing pipeline exclude files already signed with the real Microsoft key during test-signing runs, preventing MicroBuild errors caused by re-signing fully-signed assemblies.
Dual-mode Source File
FindDuplicateFiles.cs lives under build/ and is consumed both as a CodeTaskFactory inline task (for build targets that don’t reference the assembly) and as a linked <Compile> item inside the NuGet.Services.Build project (which produces the distributable assembly).
VS Setup Discovery
VisualStudioSetupConfigurationHelper.GetInstancePaths() enumerates all VS installations via COM interop, enabling build/common.ps1 to locate the appropriate MSBuild.exe without hardcoding version-specific paths.
Key Files and Classes
| File | Class / Type | Purpose |
|---|---|---|
| src/NuGet.Services.Build/NuGet.Services.Build.csproj | Project | Targets net472; links build/FindDuplicateFiles.cs and compiles VisualStudioSetupConfigurationHelper.cs; references Microsoft.Build.Framework and Microsoft.Build.Utilities.v4.0 as GAC references |
| src/NuGet.Services.Build/VisualStudioSetupConfigurationHelper.cs | VisualStudioSetupConfigurationHelper (static class) | Wraps the SetupConfiguration COM class to enumerate all Visual Studio installation paths on the current machine |
| build/FindDuplicateFiles.cs | FindDuplicateFiles (MSBuild Task) | Accepts Files item list; outputs UniqueFiles and DuplicateFiles; performs size → leading-SHA256 → full-SHA256 deduplication; optionally skips Authenticode-signed files by subject name |
| build/FindDuplicateFiles.cs (inner) | TaskItemInfo (private class) | Holds per-file state during comparison: ITaskItem reference, full path, file size, header hash, and full hash |
| build/FindDuplicateFiles.targets | MSBuild targets file | Registers FindDuplicateFiles as a CodeTaskFactory-compiled inline task so it can be used without the compiled assembly |
| build/sign.microbuild.targets | MSBuild targets file | Defines EnumerateFilesToSign, DedupeFilesToSign, and CopySignedFiles targets; imported by every project; handles both batch and per-project signing modes |
| build/sign-binaries.proj | MSBuild project | Batch signing entry point; calls EnumerateFilesToSign across all src/**/*.csproj files, then deduplicates and submits to MicroBuild |
| build/sign-packages.proj | MSBuild project | NuGet package signing; collects artifacts/*.nupkg and submits them for NuGet Authenticode signing via MicroBuild |
| tests/NuGet.Services.Build.Tests/FindDuplicateFilesFacts.cs | FindDuplicateFilesFacts (xUnit) | Tests covering missing files, path-identical duplicates, size-distinct files, leading-byte-distinct files, and content-identical duplicates across file sizes from 0 to 32,768 bytes |
Dependencies
NuGet Package References
| Package | Purpose |
|---|---|
| Microsoft.VisualStudio.Setup.Configuration.Interop v1.8.24 | COM interop assembly for the Visual Studio Setup Configuration API; used by VisualStudioSetupConfigurationHelper to enumerate VS installations |
Framework / GAC References
| Reference | Purpose |
|---|---|
| Microsoft.Build.Framework | Provides ITaskItem, IBuildEngine, and MSBuild event argument types used by FindDuplicateFiles |
| Microsoft.Build.Utilities.v4.0 | Provides the Task base class that FindDuplicateFiles extends |
Internal Project References
| Project | Purpose |
|---|---|
| tests/NuGet.Services.Build.Tests | xUnit test project; references NuGet.Services.Build directly via <ProjectReference> |
Notable Patterns and Implementation Details
The three-stage file comparison in FindDuplicateFiles is optimized to avoid unnecessary I/O. Files with unique sizes are immediately classified as unique without reading any content. Only files that share a size are read for the 1,024-byte leading hash, and only files that share a leading hash are fully hashed. This mirrors the approach described in the Stack Overflow answer cited in the source code comments.
- Batch vs. per-project signing: The BatchSign property (default true) controls whether each project signs its own output after build or defers to sign-binaries.proj. In batch mode, sign.microbuild.targets sets SkipEnumerateFilesToSign=true during the main solution build and only enumerates files when explicitly invoked by sign-binaries.proj with SkipEnumerateFilesToSign=false.
- Web app package signing: When $(WebPublishMethod) == 'Package', EnumerateFilesToSign runs after PipelineCopyAllFilesToOneFolderForMsdeploy rather than after AfterBuild, and the PackageUsingManifest target depends on CopySignedFiles so that the web deploy package always contains signed binaries.
- Signing certificate codes: First-party managed assemblies use Authenticode=Microsoft400 and StrongName=MsSharedLib72. Third-party binaries use Authenticode=3PartySHA2. NuGet packages use Authenticode=NuGet. PowerShell scripts use Authenticode=Microsoft400 only.
- RepositoryRootDirectory resolution: sign.microbuild.targets locates the repo root by walking up from the targets file until it finds build.ps1 using [MSBuild]::GetDirectoryNameOfFileAbove. This allows the targets to glob for all copies of a given output assembly across all project bin directories in the repository.
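
The upward search used for RepositoryRootDirectory can be pictured with a small sketch. This is a Python analogue for illustration only, not MSBuild's implementation; the function name and the empty-string result when nothing is found are assumptions modeled on how the MSBuild property function is documented to behave.

```python
import os


def get_directory_name_of_file_above(start_dir, file_name):
    """Return the nearest directory at or above start_dir that contains
    file_name, or "" if no ancestor contains it."""
    current = os.path.abspath(start_dir)
    while True:
        if os.path.isfile(os.path.join(current, file_name)):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return ""
        current = parent
```

Called with the directory of the targets file and the marker file name build.ps1, a search of this kind yields the repository root regardless of how deep the importing project sits, which is what makes the repository-wide glob possible.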