EmbededFileProvider doesn't work with paths/filenames with special characters
Closed this issue ยท 31 comments
If you have embeded resource in folder containing dash "-" in its name it gets translated into underscore inside an assembly. EmbededFileProvider is not respecting that.
I've put together quick (and probably dirty) fix:
var sc = tsp.Count(c => c == '/' || c == '\\');
var sp = 0;
for (var i = _baseNamespace.Length; i < builder.Length; i++) {
if (builder[i] == '/' || builder[i] == '\\') {
sp++;
builder[i] = '.';
} else if (builder[i] == '-' && sp < sc) {
builder[i] = '_';
}
}
@BrennanConroy Could you investigate?
Confirmed folders with '-' are converted to '_' in the Assembly
Would we like provider.GetFileInfo("test-name.resource");
and provider.GetFileInfo("test_name.resource");
to have the same behavior? (assuming test-name is the folder the resource is in)
@BrennanConroy I recall discussing this a while ago - do we even know a way to fix this? Aren't there ultimately always going to be ambiguities in the names of embedded resources due to how the compilation system stores the paths?
Yes there will be an ambiguity between folder names with -
and files named the same but with _
/cc @victorhurdugaci
Yeah, those names have to be valid C# identifiers and that's why -
gets converted to _
. Of course if you have a-b
and a_b
as two different names, you're going to get a name collisions
So can we do anything about this? What would a theoretical, regardless-of-cost fix look like?
It doesn't look like there's a namespace/directory hierarchy for assembly resources (though one exists for languages).
One ugly, not even fully correct mitigation could be converting - to ___ (or something similar) instead.
I remember having a similar problem in DNX time where environment variable names could not contain certain characters in Linux and OS X, and I believe we had done something like this there.
The name transformation function doesn't have an inverse. Once a name is converted, you cannot convert it back.
@muratg ___
would not work because a-a
and a___a
would collide. It's a one way transformation, nothing you can do about it. The environment variable is different because there you don't support the other character, here you support both.
Maybe we discussed this but I don't remember, why can't the embedded file provider do the transformation before reading?
Can you not use a tactic similar to preserveCompilationContext
i.e at build time, produce and embed the additional information needed (i.e a file) to preserve the directory information (original file path) via mapping?
For example, if you embedded these two files at build time:
/myfiles/sometextfiles/my_text-file.min.js
/myfiles/sometextfiles/my_text-file2.min.js
Then you would also produce an additional manifest file and embed that too, let's call it resourcePaths.txt
and its contents might look something like this:
myfiles.sometextfiles.my_text_file.min.js | myfiles/sometextfiles/my_text-file.min.js
myfiles.sometextfiles.my_text_file2.min.js | myfiles/sometextfiles/my_text-file.min.js
It's a simple mapping for the resource name and the original file path.
Now at runtime, you can get the original path information for any given resource by loading from this file and mapping. EmbeddedFileProvider could use that.
I think the problem here is that it's the .NET CLI system / CSC.exe that are doing the embedding. There's no "custom code" that runs at that point that would be able to generate these mappings. Though, I agree that if there were a way to do that, it could very well work.
@Eilon - yeah ideally this would be taken care of at that level - is this something perhaps for the dotnet cli folks to chime in on?
However you could
also take care of it using a custom tool, plugged in to the precompile
step of a project. The tool would evaluate the embed
section of the project.json
and produce the mapping file:
myfiles.sometextfiles.my_text_file.min.js | myfiles/sometextfiles/my_text-file.min.js
myfiles.sometextfiles.my_text_file2.min.js | myfiles/sometextfiles/my_text-file.min.js
The tool could also modify the embed
section itself to ensure this mapping file itself get's embedded.
Now at runtime, the EmbeddedFileProvider
can load this file and use it to recover directory information.
Any news on when we can expect a solution for this issue? I am debating whether to wait for a fix or fork and try and put in a temporary fix to use in the interim.
Decided to do the following as a temporary measure:
public class HyphenFriendlyEmbeddedFileProvider : IFileProvider
{
private readonly EmbeddedFileProvider _embeddedFileProvider;
public HyphenFriendlyEmbeddedFileProvider(EmbeddedFileProvider embeddedFileProvider)
{
_embeddedFileProvider = embeddedFileProvider;
}
public IFileInfo GetFileInfo(string subpath)
{
if (string.IsNullOrEmpty(subpath))
{
return new NotFoundFileInfo(subpath);
}
// does the subpath contain directory?
var indexOfLastSeperator = subpath.LastIndexOf('/');
if (indexOfLastSeperator == -1)
{
return _embeddedFileProvider.GetFileInfo(subpath);
}
// Does it contain a hyphen?
var indexOfFirstHyphen = subpath.IndexOf('-');
if (indexOfFirstHyphen == -1)
{
// no hyphens.
return _embeddedFileProvider.GetFileInfo(subpath);
}
// is hyphen within the directory portion?
if (indexOfFirstHyphen > indexOfLastSeperator)
{
// nope
return _embeddedFileProvider.GetFileInfo(subpath);
}
// Ok, re-write directory portion, from the first hyphen, replacing hyphens!
var builder = new StringBuilder(subpath.Length);
builder.Append(subpath);
for (int i = indexOfFirstHyphen; i < indexOfLastSeperator; i++)
{
if (builder[i] == '-')
{
builder[i] = '_';
}
}
var normalisedPath = builder.ToString();
return _embeddedFileProvider.GetFileInfo(normalisedPath);
}
public IDirectoryContents GetDirectoryContents(string subpath)
{
var contents = _embeddedFileProvider.GetDirectoryContents(subpath);
return contents;
}
public IChangeToken Watch(string filter)
{
return _embeddedFileProvider.Watch(filter);
}
}
and
var embeddedFileProvider= new EmbeddedFileProvider(assembly, assyName.Name);
var hyphenFriendlyEmbeddedFileProvider = new HyphenFriendlyEmbeddedFileProvider(embeddedFileProvider);
var fileInfo = hyphenFriendlyEmbeddedFileProvider.GetFileInfo("/this-works/js-cookie.min.js");
Turns out wrapping an EmbeddedFileProvider
only works up to a point. I wanted to add the illusion of directories, but that meant having to wrap the IFileInfo's also so i could "modify" the Name so that they didn't include the "directory" portion in the Name. Couldn't justify all the wrapper code in the end, so just created a new class instead:
/// <summary>
/// I have written this to workaround a present issue: https://github.com/aspnet/FileSystem/issues/184
/// </summary>
public class DirectoryFriendlyEmbeddedFileProvider : IFileProvider
{
private static readonly char[] _invalidFileNameChars = Path.GetInvalidFileNameChars().Where(c => c != '/' && c != '\\').ToArray();
private readonly Assembly _assembly;
private readonly string _baseNamespace;
private readonly DateTimeOffset _lastModified;
/// <summary>
/// Initializes a new instance of the <see cref="EmbeddedFileProvider" /> class using the specified
/// assembly and empty base namespace.
/// </summary>
/// <param name="assembly"></param>
public DirectoryFriendlyEmbeddedFileProvider(Assembly assembly)
: this(assembly, assembly?.GetName()?.Name)
{
}
/// <summary>
/// Initializes a new instance of the <see cref="EmbeddedFileProvider" /> class using the specified
/// assembly and base namespace.
/// </summary>
/// <param name="assembly">The assembly that contains the embedded resources.</param>
/// <param name="baseNamespace">The base namespace that contains the embedded resources.</param>
public DirectoryFriendlyEmbeddedFileProvider(Assembly assembly, string baseNamespace)
{
if (assembly == null)
{
throw new ArgumentNullException("assembly");
}
_baseNamespace = string.IsNullOrEmpty(baseNamespace) ? string.Empty : baseNamespace + ".";
_assembly = assembly;
_lastModified = DateTimeOffset.UtcNow;
// need to keep netstandard1.0 until ASP.NET Core 2.0 because it is a breaking change if we remove it
#if NETSTANDARD1_5 || NET451
if (!string.IsNullOrEmpty(_assembly.Location))
{
try
{
_lastModified = File.GetLastWriteTimeUtc(_assembly.Location);
}
catch (PathTooLongException)
{
}
catch (UnauthorizedAccessException)
{
}
}
#endif
}
public IFileInfo GetFileInfo(string subpath)
{
if (string.IsNullOrEmpty(subpath))
{
return new NotFoundFileInfo(subpath);
}
var name = Path.GetFileName(subpath);
var encodedPath = EncodeAsResourcesPath(subpath);
var resourcePath = _baseNamespace + encodedPath;
if (HasInvalidPathChars(resourcePath))
{
return new NotFoundFileInfo(resourcePath);
}
if (_assembly.GetManifestResourceInfo(resourcePath) == null)
{
return new NotFoundFileInfo(name);
}
return new EmbeddedResourceFileInfo(_assembly, resourcePath, name, _lastModified);
}
public IDirectoryContents GetDirectoryContents(string subpath)
{
// The file name is assumed to be the remainder of the resource name.
if (subpath == null)
{
return new NotFoundDirectoryContents();
}
var encodedPath = EncodeAsResourcesPath(subpath);
var resourcePath = _baseNamespace + encodedPath;
if (!resourcePath.EndsWith("."))
{
resourcePath = resourcePath + ".";
}
var entries = new List<IFileInfo>();
// We will assume that any file starting with this path, is in that directory.
// NOTE: This may include false positives, but helps in the majority of cases until
// https://github.com/aspnet/FileSystem/issues/184 is solved.
// TODO: The list of resources in an assembly isn't going to change. Consider caching.
var resources = _assembly.GetManifestResourceNames();
for (var i = 0; i < resources.Length; i++)
{
var resourceName = resources[i];
if (resourceName.StartsWith(resourcePath))
{
entries.Add(new EmbeddedResourceFileInfo(
_assembly,
resourceName,
resourceName.Substring(resourcePath.Length),
_lastModified));
}
}
return new EnumerableDirectoryContents(entries);
}
public IChangeToken Watch(string filter)
{
return NullChangeToken.Singleton;
}
protected virtual string EncodeAsResourcesPath(string subPath)
{
var builder = new StringBuilder(subPath.Length);
// does the subpath contain directory portion - if so we need to encode it.
var indexOfLastSeperator = subPath.LastIndexOf('/');
if (indexOfLastSeperator != -1)
{
// has directory portion to encode.
for (int i = 0; i <= indexOfLastSeperator; i++)
{
var currentChar = subPath[i];
if (currentChar == '/')
{
if (i != 0) // omit a starting slash (/), encode any others as a dot
{
builder.Append('.');
}
continue;
}
if (currentChar == '-')
{
builder.Append('_');
continue;
}
builder.Append(currentChar);
}
}
// now append (and encode as necessary) filename portion
if (subPath.Length > indexOfLastSeperator + 1)
{
// has filename to encode
for (int c = indexOfLastSeperator + 1; c < subPath.Length; c++)
{
var currentChar = subPath[c];
// no encoding to do on filename - so just append
builder.Append(currentChar);
}
}
return builder.ToString();
}
private static bool HasInvalidPathChars(string path)
{
return path.IndexOfAny(_invalidFileNameChars) != -1;
}
}
Anything regarding this problem? It really messes up with a lot of technologies used today, like npm, for instance.
When using libraries like Angular 2, you need to reference the dependencies that are stored (npm-style) like "../lib/@angular/core/index.js". The EmbededFileProvider provide it as "../lib/_angular/core/index.js", so it loses its original reference.
Of course there are some "workarounds" for this, as renaming folders/files manually before embedding. But, sure? It's not the 21st century way of doing things, specially when we talk about Microsoft technologies, right?
Thanks in advance for any help with this.
@luanmm unfortunately I don't know what we can do with this bug here. It's a fundamental limitation of how embedded resources work in .NET. It could just be that the Embedded File Provider isn't the right solution for the issue you describe because it doesn't support certain path characters.
I managed to solve this issue, but it wasn't trivial. Had to write a tool to run at build time to generate a mappings file, which maps the embedded resource names of the files to be embedded, with their relative paths on disk. I then embed this mappings file into the assembly. At runtime, on startup, I grab the embedded mappings file, and set up an InMemoryFileProvider (which I had to write, and made available as a nuget package) to serve each embedded resource on the desired request path.
Like i said, it wasn't trivial but it does work.
I don't see why this couldn't be solved in the tooling if the will to do so was there. The latest vs2017 project system (CPS) could have a project level msbuild option to embed this mapping into the assembly automatically. The embedded file provider could be extended to honour such a mapping if it exists within the assembly it's handed.
I can't assert that this is the best solution.. just a possible one!
I partially agree with you, @Eilon. I don't see that this is a concern just for the "FileSystem team", too, because the resources already come with the wrong names.
But, for sure, it is a problem in the ASP.NET framework, that is not capable of providing a good EmbededFileProvider because the "packaging system" that handles the embedded resources just messes up with file/folder names.
The solution is not trivial, I know, but with all the tools that is being developed for .NET Core there must be a way to get things going right all the way so the EmbededFileProvider may have the original names to handle requests right.
The @dazinator's proposal is one of the possibilities for this, for instance. But it would be great to have an "official" version doing this, designed accordingly to the .NET Core concepts.
I think that this improved resource embedding system could even benefit developers in other scenarios too, not just in web projects.
This is something really useful when we start thinking about modularity in big projects (one of the most difficult things when we think in Microsoft technologies before .NET Core, and one of the most exciting news for me in this new era). If there is no attention to things like these, ASP.NET can't be a good option for who wants to take advantage of new technologies as Angular, npm and so on.
Perhaps if we share this issue with the team responsible for the .NET Core tools they may help us to design a proper solution?
@luanmm yep I agree. By coincidence I was also thinking about the same solution @dazinator mentioned so that could very well work.
Nice, @Eilon! Hope it can be a good solution for this item.
Thanks for your attention here, @Eilon and @dazinator!
Just one thing I forgot to reinforce: the designed solution must consider other special characters too, not the dash only (as stated in this issue title). Characters like "@" was already tested, for instance, and the case is the same.
Maybe it would be helpful to rename the issue to not mislead who will take this backlog item.
Since there's no update here:
NetCoreStack/Hisar#8 can this be used in this bug here?
I have no objections if you use any of the code I have posted here, if that's your question? Use at your own risk ofcourse :-)
I prefer now to use an InMemoryFileProvider and add my embedded resources / assets to it as MemoryStreamFileInfo's to be served on whatever request path I choose (which may or may not be related to the embedded resource path), and fully supports directories. This gives me freedom to do fancy things as well, like allow the same resource to be served on multiple different paths. I also find this provider useful for unit test scenarios.
Just for infos,
In the context of a library where there are no other types of resources and or with culture id ..., here a workaround through msbuild. Not tested with other special characters, but this prevents dash "-" from being translated into underscore. In my case, it also prevents names as SomeName.XX.Extension
from being translated to SomeName.Extension
because XX
is considered as a Culture id, and then also prevents some unwanted culture assemblies to be outputed ...
Note: here all embedded resources are updated but this is not mandatory.
<Target Name="EmbeddModuleAssets" BeforeTargets="BeforeBuild">
...
<EmbeddedResource Update="@(EmbeddedResource)" Condition="'%(EmbeddedResource.Link)' == ''">
<LogicalName>$([System.String]::new('$(MSBuildProjectName)\%(RelativeDir)%(FileName)%(Extension)').Replace('\', '.').Replace('/', '.'))</LogicalName>
</EmbeddedResource>
...
</Target>
<Target Name="UpdateModuleAssets" AfterTargets="CreateManifestResourceNames">
<ItemGroup>
<EmbeddedResource Update="@(EmbeddedResource)" Condition="'%(EmbeddedResource.WithCulture)' == 'true'">
<WithCulture>false</WithCulture>
<Culture></Culture>
</EmbeddedResource>
</ItemGroup>
</Target>
@javiercn - is this basically the thing we were talking about for embedding files w/ a manifest?
This is fixed by af6fa5a.
We've created a new provider that uses a manifest created at compile time. This provider reflects the original path of the embedded files at the time they were embedded on the assembly.
The manifest gets generated by referencing Microsoft.Extensions.FileProviders.Embedded and adding <GenerateEmbeddedFilesManifest>true</GenerateEmbeddedFilesManifest>
to your csproj library.
The manifest gets consumed using ManifestEmbeddedFileProvider instead of the existing EmbeddedFileProvider
@javiercn I am so pleased this was fixed properly in the framework via a manifest.
and adding true to your csproj library.
Can you clarify that bit - is there some new msbuild property?
Iโve corrected the comment. It contained malformed markdown