API Additions and the CLR


#1

I would like to propose an API addition to the String class: adding String.StartsWith(char) and String.EndsWith(char). However, this would be a breaking change for an edge case in VB.NET.

So what are the options here? Can CoreFX have additional functionality that would actually be part of the CLR, or is String part of the surface-area contract that all implementations must expose with identical behavior?


#2

Does String.StartsWith(Char.ToString('a')) not do what you require (C# implementation)? I think String.StartsWith(System.Convert.ToString('a')) would work as well.


#3

I’d have thought that an extension method would work fine for this. On the other hand, what is the edge case that would make it a breaking change?
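For illustration, an extension-method version could look roughly like the sketch below. The class and method names (StringCharExtensions, StartsWithChar, EndsWithChar) are hypothetical, chosen so they cannot collide with any instance method on String. The ordinal comparison and the null handling are assumptions, not something settled in this thread.

```csharp
using System;

// Hypothetical helper class; none of these names are part of the framework.
public static class StringCharExtensions
{
    // Ordinal check of the first character. No temporary string is allocated.
    public static bool StartsWithChar(this string s, char c)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        return s.Length > 0 && s[0] == c;
    }

    // Ordinal check of the last character.
    public static bool EndsWithChar(this string s, char c)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        return s.Length > 0 && s[s.Length - 1] == c;
    }
}
```

A call site would then read "abc".StartsWithChar('a'). An empty string returns false for both methods because there is no character to compare.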


#4

@AndyW2 Sure, there are multiple ways to achieve the result. For performance-sensitive code, though, you don’t want the extra string allocation, and checking the first char is almost 5 times faster than StartsWith("a", StringComparison.Ordinal).

@jammycakes Most of the comparison methods on the String class could be extension methods. There are already Compare(char) and IndexOf(char); I think StartsWith(char) and EndsWith(char) would be good additions.

The breaking change would be under the following set of conditions:

  • Language is VB.NET
  • Option Strict is off
  • A char is passed as the argument to StartsWith(string), which performs a culture-based comparison (the implicit conversion compiles this as Conversions.ToString(char))
  • The char is one that other chars are considered culturally equal to (e.g. '"')
  • The string being searched starts with a matching but different char (e.g. "\u204D")
  • The application is recompiled (the new ordinal-based char overload is called instead of the implicit conversion)
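
To make the two code paths in the list above concrete, here is a rough C# sketch of what each one amounts to. The class and method names (OverloadDemo, ViaStringOverload, ViaOrdinalChar) are hypothetical. c.ToString() stands in for the Conversions.ToString(char) call that VB.NET emits, and the ordinal body is only an assumption about how a char overload would behave.

```csharp
using System;

// Hypothetical demo class; the names are for illustration only.
public static class OverloadDemo
{
    // Path taken today: the char is converted to a string and the
    // culture-sensitive StartsWith(string) comparison is used.
    public static bool ViaStringOverload(string s, char c)
    {
        return s.StartsWith(c.ToString(), StringComparison.CurrentCulture);
    }

    // Path taken after recompiling against a char overload,
    // assuming that overload compares ordinally.
    public static bool ViaOrdinalChar(string s, char c)
    {
        return s.Length > 0 && s[0] == c;
    }
}
```

For most inputs the two methods agree. They can only disagree when the current culture treats the string's leading char as equal to c even though the code units differ, and whether a particular pair of chars behaves that way depends on the culture data of the runtime and operating system.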

But my question is really how these kinds of API additions can be handled by CoreCLR/CoreFX. /cc @davkean


#5

I think it is important to consider ‘separation of concerns’, such that Strings work with Strings and Chars work with Chars. Having Strings accept chars directly as input would likely violate this rule, which is why the current mechanism requires an explicit conversion from char to string.

We also need to remember that good programming practices allow us to combine basic building blocks into more complex functionality using ‘loose coupling’. If we provide a method in the string class to manipulate chars, then we are creating a dependency between the two concepts (string and char), which is a form of tight coupling. In the long run this leads to architecture hardening, which can be difficult to undo.

Extension methods can be useful for solving repetitive, complex problems. However, when they merely wrap simple existing functionality (in this case, saving the programmer the effort of converting a char to a string), we need to look at the readability and maintainability of the resulting application. Wrapper functions hide the intent of a piece of code from the reader, who may not be the person who originally wrote it. This increases what we call the cognitive complexity of the code, which can lead to a higher risk of introducing defects (by inference). Cognitive complexity is sometimes a necessity, but it is something we need to keep to a minimum.

Finally, when referring to performance, I think it is important to look beyond the high-level language (C#, VB) and understand what the compiler is doing at the IL level and what the CPU is doing at the assembler op-code level. Both are exceptionally capable of understanding the string/char pattern intent signified by the code and optimising the implementation for maximum performance.


#6

To be honest I think that’s probably taking the concept of separation of concerns a bit too far. By a similar argument, methods such as DateTime.AddDays() would be a bad practice.


#7

Does that not expand out to DateTime.AddDays(abc), which is roughly the same form as the String example above? For example, abc could be Int32.Parse("123").

You’ll find that when you actually expand out most of the classes they generally take the same form.

C# is an ECMA-standardized language, so remember that the double keyword refers to System.Double. The call is therefore really DateTime.AddDays(System.Double), similar to String.StartsWith(System.Char) (both Char and Double are structs defined in mscorlib).

