“what are the advantages of using a configuration object for function parameters?”
Summary of results
I see a benefit in having this options container: you get a central place to set the options, and then you only pass down the decoder, so the user of the function doesn't have to bother with the configuration.
Keyword arguments or named parameters solve this. In JS I tend to pass a single object to a function with a large number of parameters like this. But I do agree using named entities like enums or constants is also good. A bit heavier but if you do it everywhere the cost is worth it.
When used in conjunction with named parameters with default values, it makes it possible to write function signatures that are significantly more readable than those written in Java, and with far less boilerplate.
There's nothing anti-functional in bundling related functions into an object.
On the contrary, this helps with modularity, which is a good thing. You can then swap out the object for another object with the same interface, but where the functions (methods) have different implementations.
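A rough TypeScript sketch of that swap. The `Codec` interface and both implementations are hypothetical names invented for illustration, not taken from any library.

```typescript
// Hypothetical interface: a bundle of related functions.
interface Codec {
  encode(value: string): string;
  decode(value: string): string;
}

// One implementation: leaves the text untouched.
const identityCodec: Codec = {
  encode: (value) => value,
  decode: (value) => value,
};

// Another implementation with the same interface: reverses the text.
const reverseCodec: Codec = {
  encode: (value) => value.split("").reverse().join(""),
  decode: (value) => value.split("").reverse().join(""),
};

// The caller depends only on the interface, so either object can be passed in.
function roundTrip(codec: Codec, message: string): string {
  return codec.decode(codec.encode(message));
}

const viaIdentity = roundTrip(identityCodec, "hello");
const viaReverse = roundTrip(reverseCodec, "hello");
```

Both calls give back the original message; only the object handed to `roundTrip` changed.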
Either copy/paste or rolling them into config objects and passing those down is generally preferable. Copy paste doesn’t always feel great for pass through arguments but it’s perfectly interpretable.
Naked kwargs are so difficult to work with that I struggle to think of a use case where they wouldn’t be an anti-pattern.
The main benefit is you can have configuration options without having to specify all values, and also have non-zero-value defaults. Let's say you had something like Sarama's config struct, which contains 50 or so config knobs. The following will lead to some terrible defaults:
NewConsumer("kafka:9043", Config{ClientID: "foo"})
Here, with this config, there is a config option `MaxMessageBytes` which will be set to 0, which will reject all your messages. What Sarama does is, you can pass a `nil` config which will load defaults, or:
conf := sarama.NewConfig()
conf.ClientID = "foo"
conf.RackID = "bar"
NewConsumer("kafka:9043", conf)
and so on. This is ok but can be cumbersome, especially if you just need to change one or two options or if some config options need to be initialized. Also, someone can still do &Config{...} and shoot themselves in the foot. The functional options style is more concise:
NewConsumer("kafka:9043", WithClientID("foo"), WithRackID("bar"))
I used to be a fan of this style, and I even have an ORM built around this style (ex. Query(WithID(4), WithFriends(), WithGroup(4))), but I think for options like these a Builder pattern is actually better if your intention is clarity.
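For anyone who hasn't seen the functional options style, here is a minimal sketch of the idea, written in TypeScript rather than Go. The config fields echo the example above, but the default values and every function name here are assumptions made up for illustration; this is not Sarama's API.

```typescript
// Hypothetical config; the default values below are invented for illustration.
interface ConsumerConfig {
  clientId: string;
  rackId: string;
  maxMessageBytes: number;
}

// An option is a function that adjusts a config that already holds defaults.
type ConsumerOption = (config: ConsumerConfig) => void;

const withClientId = (clientId: string): ConsumerOption => (config) => {
  config.clientId = clientId;
};

const withRackId = (rackId: string): ConsumerOption => (config) => {
  config.rackId = rackId;
};

function newConsumer(address: string, ...options: ConsumerOption[]) {
  // Start from non-zero defaults, then apply only what the caller overrides.
  const config: ConsumerConfig = {
    clientId: "default-client",
    rackId: "",
    maxMessageBytes: 1000000,
  };
  for (const applyOption of options) {
    applyOption(config);
  }
  return { address, config };
}

const consumer = newConsumer("kafka:9043", withClientId("foo"), withRackId("bar"));
```

Because the defaults are filled in inside `newConsumer`, a caller who sets only the client ID never ends up with a zero `maxMessageBytes`.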
In Java or C++ or other OOPish languages, say, you might make all your classes be Configurable (unless they are Configuration), which means their constructors would take a Configuration in some way, possibly with an explicit Configuration argument or with a Configurable argument whose configuration to copy. This way all your objects will know how to find configuration information.
One of the advantages of functions is that a well-named function is self-documenting. If you can take a bunch of lines and wrap them in a function whose name summarizes exactly what it does, then you have improved readability in my opinion. In this example, I don't really need to know the details of how the query parameters are extracted. I just want to know I've got them.
I code in this style. I can tell a lot of things started going right when I adopted the style too. Instead of global variables I have a single config which I initially pass around to functions, then I refine and pass only the data each function needs. The config generally doesn't get mutated. Testing is easier. Functions are easy to reason about. I keep functions to a dozen lines or so, but not to the same neurotic level Uncle Bob prescribes. Likewise a few extra arguments are fine, but too many usually signals that a new type is required, or that it's time to compose a new function which utilizes the function whose parameters are growing.
You don't suggest any solution. Do you want more function overloading or maybe config objects?
Adding default parameters works well with existing code. It is not bad or lazy just because it is easy.
Caller provided objects are a standard idiom that offers greater flexibility to use static/global vars, objects with a FAM, and custom allocators.
I think the idea is that the constructor is for dependencies, not parameters: things that are REQUIRED to be set at construction time for the sanity of the object. Anything that is optional can be set in the normal way after construction, e.g. using Object.assign.
If you have a situation where an object with many parameters must be configured with a certain (yet different) subset of parameters in certain scenarios, then you probably want a factory.
> Either copy/paste or rolling them into config objects and passing those down is generally preferable
Preferable for whom? I do not prefer it. I much prefer to avoid the extra work it creates for me vs. the simplicity of kwargs. I use explicit args for the function I made and then add **kwargs on the end, and then I don't have to write bespoke config objects or copy and paste a bunch of stuff that might be made obsolete by a future update to some library and that also pollutes my function's signature. I would very much welcome a way to tell callers where kwargs is going without having to do extra work.
It's funny how little developers think about how to do configuration right.
It's just a bunch of keys and values, stored in some file, or generated by some code.
But it's actually the whole ball game. It's what programming is.
Everything is configuration. Every function parameter is a kind of configuration. And all the configuration in external files inevitably ends up as a function parameter in some way.
The problem is the plain-text representation of code.
Declarative configuration files seem nice because you can see everything in one place.
If you do your configuration programmatically, it is hard to find the correct place to change something.
If our code ran in real-time to show us a representation of the final configuration, and we could trace how each final configuration value was generated, then it wouldn't be a problem.
But no systems are designed with this capability, even though it is quite trivial to do. Configuration is always an after-thought.
Now extend this concept to all of programming. Imagine being able to see every piece of code that depends upon a single configuration value, and any transformations of it.
Also, most configuration is probably better placed into a central database because it is relational/graph-like. Different configuration values relate to one another. So we should be looking at configuration in a database/graph editor.
Once you unchain yourself from plain-text, things start to become a lot simpler...of course the language capabilities I mentioned above still need to become a thing.
Tangential but coding in JS/TS, I often will go for object arguments to make things more readable. If you have a function like:
foo(arg1: boolean, arg2: boolean, arg3: boolean)
Then when you call it, it will look like foo(true, false, true), which is not great for readability. Instead I move all the arguments into an object, which makes each field explicit, i.e.
foo({ arg1: true, arg2: false, arg3: true })
This also carries the advantage that you can quickly add/remove arguments without going through an arduous refactor.
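A small sketch of the same idea with defaults added. The function and flag names (`renderText`, `bold`, `italic`, `uppercase`) are made up for illustration.

```typescript
// Hypothetical flags, named only for illustration.
interface RenderFlags {
  bold?: boolean;
  italic?: boolean;
  uppercase?: boolean;
}

// Destructuring in the parameter list gives each flag a name and a default.
function renderText(
  text: string,
  { bold = false, italic = false, uppercase = false }: RenderFlags = {},
): string {
  let result = uppercase ? text.toUpperCase() : text;
  if (italic) result = `_${result}_`;
  if (bold) result = `**${result}**`;
  return result;
}

const plain = renderText("hi");
const loud = renderText("hi", { uppercase: true, bold: true });
```

Adding a fourth flag later only touches the interface and the function body; existing call sites keep working because every flag has a default.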
- If your configuration has more than 5-10 options then env vars become a mess while a configuration file is more maintainable
- Nested configuration / scoping is a mature advantage of configuration files
- You can reload configuration files whereas you can't reload environment variables during runtime
- A configuration file is a transparent record of your configuration and easier to backup and restore than env variables. Env variables can be loaded from many different locations depending on your os.
- In configuration files you can have typed values that are parsed automatically; with env variables you need to parse them yourself. This is just a difference, and not that bad for env variables per se.
Yes, but you can achieve that just by having a normal function that takes some configuration parameters. The documentation of the library suggests that efficiency was a significant concern, and I assume that’s why the implementation is not as straightforward as it otherwise might be.
The bottom line seems to be that such an ad-hoc configuration mechanism should be considered part of the public interface, which requires managing backwards compatibility when changes become necessary. Developers usually don't like doing that and are probably only used to doing this with function signatures and object structures, not with source code files. Under that lens, it should become obvious how bizarre it is to expect configuration to happen that way.
What could an alternative be? First of all, considering user-friendly configuration as a first-class feature and explicitly thinking about the upgrade path. Developing a configuration file format and most importantly having a backwards compatibility policy for that seems the obvious solution.
The "convention over configuration" idea can be regarded as self-referential. These guidelines are often good default choices in situations where you don't have a strong reason to go either way. You wouldn't write a program which is configurable between those choices (e.g. exhibits more code repetition or less based on a run-time setting), so you go with a good default convention. If you can avoid repeating yourself, that's usually good; machines should do the repetitive work rather than people. If you know that two or more repetitions of something are only initially that way and are soon going to diverge, then you might as well fork those copies now. Or maybe do allow yourself to repeat yourself, but via macro. Some compiler optimizations violate DRY by design: function inlining and loop unrolling. It's invisible to humans who aren't disassembling the output or measuring code size changes.
Functions are a nice to have, but:
- It tends to make things less declarative.
- You lose locality of behavior, which is very useful in configuration.
Also, Nickel doesn't support injecting data into the Nickel file, so an external program can't set variables, query a database and pass the result to the conf file, etc.
I think because there are some desirable traits for config languages that don't exist in general purpose languages.
Not an exhaustive list, but generally:
- Usually constrained to reduce complexity and try to eliminate the need for testing config.
- Interoperable between many different programming languages.
- Readable by programmers working in different languages.
I think this usually makes config languages favour declarative over imperative which usually eliminates most general purpose languages.
Another topic is why do we use configuration at all and what is the difference between code and config.
Also it’s easy enough to pass objects with named properties and destructure the object in the function parameters. So there’s no real need for named parameters anymore and it would just add another duplicative way of writing the same code.
> [In Chapel] there’s the config keyword. If you write config var n=1, the compiler will automatically add a --n flag to the binary. As someone who 1) loves having configurable program, and 2) hates wrangling CLI libraries, a quick-and-dirty way to add single-variable flags seems like an obvious win.
Letting people define configurable variables at their call site is incredibly valuable, even if you don't have compile-time support, and even if you're working on something not meant to be an isolated binary.
At my startup, one of our most beloved innovations is that you can write `resolve_config("foo", default="bar", request=request)` pretty much anywhere you'd normally hardcode a value or feature flag... and that's it.
The first time it's seen in any environment, it thread-safely inserts-if-not-present the default value into a key-value storage that's periodically replicated into in-memory dictionaries that live on each of our app servers. Any subsequent time it's accessed, it's a synchronous key-value lookup in memory, with barely any overhead. But we can also configure it in a UI without needing a code redeploy, and have feature flags and overrides set on a per-user or per-tenant basis.
Sometimes, you don't need language support if you have some clever distributed-systems thinking :)
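To make the mechanics concrete, here is a heavily simplified single-process sketch in TypeScript. It is only a guess at the shape of such a helper: the replicated key-value storage is replaced by a plain `Map`, the per-tenant override table and the `RequestContext` type are invented for illustration, and nothing here reflects the commenter's actual implementation.

```typescript
// Stand-in for the replicated key-value storage described above (assumption:
// a plain in-process Map, with no replication or concurrency concerns).
const configStore = new Map<string, string>();
// Hypothetical per-tenant overrides, keyed as "tenantId:key".
const tenantOverrides = new Map<string, string>();

interface RequestContext {
  tenantId?: string;
}

function resolveConfig(key: string, defaultValue: string, request: RequestContext = {}): string {
  // First sighting of a key: register its default so it can later be edited.
  if (!configStore.has(key)) {
    configStore.set(key, defaultValue);
  }
  // A per-tenant override wins over the shared value.
  if (request.tenantId !== undefined) {
    const override = tenantOverrides.get(`${request.tenantId}:${key}`);
    if (override !== undefined) return override;
  }
  return configStore.get(key) ?? defaultValue;
}

const firstLookup = resolveConfig("greeting", "hello");
configStore.set("greeting", "howdy"); // simulates an edit made through a UI
const secondLookup = resolveConfig("greeting", "hello");
tenantOverrides.set("acme:greeting", "bonjour");
const tenantLookup = resolveConfig("greeting", "hello", { tenantId: "acme" });
```

The call site stays a one-liner with an inline default, while the value it resolves to can change without touching the code.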
I like tools that have 1:1 mappings for config file and command line flags because, as you say, both have advantages and disadvantages.
Not sure if I would call hardcoded settings in the sourcecode configuration - they are just constants. The primary benefit of configuration is the ability of changing the behavior of software without having to rebuild, redeploy, redistribute or even restart. Either by the user or by the developer or sysadmin, depending on context.
And the major advantage is that the object data, by being coupled with the control block, reduces fragmentation and avoids the error of mishandling memory leaks when new obj() fails as a parameter to shared pointer construction.
You have to instantiate variables anyway. Grouping those variables in objects simply makes your code more understandable; there is really no extra cost.
When they're function parameters, you can write them as regular function declarations, btw. ;) Might hurt your memorization efforts though.
Is that really such a common case? Obviously it depends what you’re configuring, but I definitely would not expect that it’s typically OK to jump in and modify the configuration of an object that’s already in use. What if it does do some expensive one-off setup using the supplied config at object creation time?
If you really need a general scoped override, it could be done in the config struct approach just by copying and restoring the entire config. This might be expensive if the struct is big, but on the other hand, you could change multiple properties at once which doesn’t look possible in the function-based approach.
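A minimal sketch of the copy-and-restore approach, assuming a small settings object. `RenderSettings`, `activeSettings` and `withSettings` are hypothetical names.

```typescript
interface RenderSettings {
  color: string;
  lineWidth: number;
}

// Hypothetical shared settings object.
let activeSettings: RenderSettings = { color: "black", lineWidth: 1 };

// Run `body` with several properties overridden at once, then restore the old config.
function withSettings<T>(overrides: Partial<RenderSettings>, body: () => T): T {
  const saved = activeSettings;
  activeSettings = { ...saved, ...overrides }; // work on a copy, never mutate the saved one
  try {
    return body();
  } finally {
    activeSettings = saved; // restored even if body throws
  }
}

const describeSettings = (): string => `${activeSettings.color}/${activeSettings.lineWidth}`;
const insideScope = withSettings({ color: "red", lineWidth: 3 }, describeSettings);
const afterScope = describeSettings();
```

Two properties change in one call, and the previous config comes back as soon as the scope ends.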
I generally prefer keeping all the configuration in as few languages as possible and preferably in a single language. Adding filesystem-based config where a config option object in the main language of javascript would suffice goes against that.
Also, given a filesystem config, now I'm forced to have many very small files around for each route where each file is most likely just a call to another service handler. I'd prefer to mash most into bigger files that handle related but distinct routes.
Less important, but comes up, it's nice to be able to match routes based on code and not just string equality ... e.g. everyone seems to like having routes for usernames start with '@'
JavaScript has named parameters, in sort of the same way that C has named parameters.
Get used to writing your functions using objects for the arguments:
function myFunc({ foo, bar }) {}
Then you can call
myFunc({ foo: 1, bar: "x" })
Similarly in C/C++
struct MyFuncArgs { int foo; char bar; };
void myFunc(MyFuncArgs args) {}
myFunc({ .foo = 1, .bar = 'x' });
I think the way it's done is correct, an object passed as a parameter with key value pairs for attributes seems a lot more logical.
Can you explain how having a global variable is more performant than passing a pointer to an object as a function argument in practice?
I prefer Nim's approach where you can have objects, but they're just variables (properties). The procedures aren't tied to objects, but you can pass/return objects. To me this is more flexible.
I can write OO code as well, this is just personal preference.
There's one advantage of foo.bar calling a function, it works well with referential transparency. Whether it's a property/value or function, only the result matters. I can't say it's a big difference though, I've only been mildly annoyed by having to change all call-sites when changing between them. Other languages allow code bodies for getters/setters (foo.bar = ...) so it still hides the call. For a C-style syntax language having the () seems less surprising.
And allows to create any number of objects of its kind without explicitly needing to allocate memory for the objects. That is a major advantage I think, without that the code will get cluttered with all the memory allocation/deallocation going on explicitly.
Statically typing your parameters prevents a lot of instances of passing them in in the wrong order. And if you make a habit of using a single statically typed options parameter instead of multiple parameters that have the same type, you can make passing parameters in the wrong order completely impossible in your codebase.
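A short illustration of the difference, with made-up function names. Two positional parameters of the same type can be swapped without the compiler noticing, while a single typed options parameter labels every value at the call site.

```typescript
// With positional parameters of the same type, swapped arguments still type-check.
function resizePositional(width: number, height: number): string {
  return `${width}x${height}`;
}

// With a single typed options parameter, every value is labelled by the caller.
interface ResizeOptions {
  width: number;
  height: number;
}

function resizeImage({ width, height }: ResizeOptions): string {
  return `${width}x${height}`;
}

const swapped = resizePositional(600, 800); // meant 800 wide, 600 high: no compiler help
const labelled = resizeImage({ height: 600, width: 800 }); // key order is irrelevant
```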
As an addendum, if the function only needs the keys, I would possibly just have the parameter be a string[] and expect the user to call object.Keys and pass in the result.
That way the function isn't asking for parameters it doesn't really care about.
Though I do get the appeal of having the function call object.Keys itself if it's called frequently, so as not to have to sprinkle that call everywhere.
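A tiny sketch of the first variant, assuming the comment is about JavaScript's `Object.keys`. The function name and the sample config are hypothetical.

```typescript
// The function only needs the keys, so it asks for a string[] and nothing more.
function findUnknownKeys(keys: string[], allowed: string[]): string[] {
  return keys.filter((key) => allowed.indexOf(key) === -1);
}

// The caller decides where the keys come from.
const userConfig = { host: "localhost", prot: 8080 }; // "prot" is a deliberate typo
const unknownKeys = findUnknownKeys(Object.keys(userConfig), ["host", "port"]);
```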
Preferably written in assembler, to avoid the extra complexity of a compiler, right?
Configuration files have been a common feature of software for as long as OSs have existed, basically. They serve a clear and useful purpose, even though they create some problems of their own.
Or you could just take the two parameters you're actually using on your function. No new type, no need to pass your mega-object, just take two nice strongly typed arguments.
> discussions have been trending back to just using configuration structs, rather than any of the other fancy options proposed over the years.
It may look a little more fat, and probably copies some fields that will end up in configuration but...
1. Go is very adept at copying large structures
2. A fully scaffolded struct is far easier to read than something hidden inside a function somewhere
Why would you need typing in a configuration file? I would think a configuration file would be specific to your program and any interpretation of data would be handled by your program.
I think we're off on a tangent here, but I will at least agree that named parameters are a godsend in any programming language. Things are so much easier to read when the caller can clearly state their intentions for things like "foo(true, false, true)".
Really, it’s because of the habit of Ruby programs to pass a single options (params, args, config) object in to avoid complex configuration.
Really, what’s needed is separation of concerns, instead of a single “do stuff” function that takes an “everything” argument.
But what’s really wanted is global variables with everything in a single scope.
So in some ways, the best generalization like this I already commonly see is a capabilities system, which can be done essentially via function parameters. I've used this as an OOP pattern once or twice and I think it's the easiest way to explain it:
Imagine you have 2 singleton objects in your 'void main()', redkey and bluekey of types Redkey and Bluekey.
Somewhere else, you declare your function: 'int foo(Redkey redkey, int x, int y)' that needs a Redkey object of which only one exists: in your main.
This by itself forces every call on the path between 'main()' and a call to 'foo()' to also include a parameter Redkey.
In the extreme cases where a pattern like this is useful (tracing IO calls, you name it) it can really help cut through a codebase after an hour of refactoring. But it can be limiting. Async and Checked Exceptions are probably the most colored functions and they both need escape hatches because of that.
When JavaScript added hash/object destructuring (both at the argument level and when assigning variables), I noticed code has been using dict-like function arguments everywhere. It makes typing them a bit more of a pain in the ass (especially without default arguments).
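The Redkey example above might look roughly like this in TypeScript. This is only a sketch: the `uses` log and `middleLayer` are additions for illustration, and the "only main issues the key" rule is a convention here that the language does not enforce at runtime.

```typescript
// A capability token. The private constructor stops `new RedKey()` elsewhere at
// compile time; by convention only the entry point calls `issue()`.
class RedKey {
  readonly uses: string[] = [];
  private constructor() {}
  static issue(): RedKey {
    return new RedKey();
  }
}

// Requires the capability, so every caller on the path from main must pass it along.
function foo(redKey: RedKey, x: number, y: number): number {
  redKey.uses.push("foo"); // e.g. trace the privileged call
  return x + y;
}

function middleLayer(redKey: RedKey, x: number): number {
  return foo(redKey, x, 1);
}

function main(): { sum: number; uses: string[] } {
  const redKey = RedKey.issue(); // the single instance lives here
  const sum = middleLayer(redKey, 41);
  return { sum, uses: redKey.uses };
}

const mainResult = main();
```

`middleLayer` does nothing with the key itself, yet its signature still has to carry it, which is exactly the "coloring" effect described above.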
I haven’t decided if I like it better than just breaking up objects into arguments in a more simple functional style.
On one hand it’s more predictable, but on the other most complex apps start passing around objects for everything. TypeScript of course helps with that, as does neatly modularized code (i.e. not passing in full typed objects outside of the parent module which owns/generates them unless they uniquely operate on the full object).
These are the small decisions you end up making a hundred times.
Named parameters are easy enough to mimic in JS using objects and destructuring:
function foo({bar, baz}) { ... }
foo({bar: 1, baz: false})
You can get an associative array with one more parameter, so that's what I do. Array access is nicer than object access to me, but maybe that's just me.
Why not put it in code? You program in a strongly typed language, so you write the config in that language. You have a config function (analogous to the file) and a config "data" type (returned by the function, specified by the "configuree"). The function can only read env vars (keys, secrets, etc.) and return the data structure laid out by the app, strongly typed enough that you can easily prevent any side effects by restricting the return type of the config function.
Your IDE can help you write only acceptable config files (functions) this way.
In case you do not want to recompile for conf file changes, many languages come with some kind of interpreter. You may even make it hot-reloadable for some properties.
You need more, you probably need a config service, which you can build in a type safe fashion as well.
The problem with using a general-purpose programming language for configuration is that you lose the ability to statically interpret it. Maybe one solution is to make sure the configuration SDK is fully side effect free, so that it's always easy to run the configuration with fixed inputs and get a deterministic output.
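One way to picture the "side effect free" idea is a config function that receives the environment as plain data instead of reading it itself. This is a sketch under that assumption; `AppConfig`, `buildConfig` and the variable names are hypothetical.

```typescript
// Hypothetical config shape, defined by the application.
interface AppConfig {
  port: number;
  debug: boolean;
  databaseUrl: string;
}

// The config "file" is a pure function: the environment is passed in as data,
// so the same input always produces the same output.
function buildConfig(env: Record<string, string | undefined>): AppConfig {
  const port = Number(env["PORT"] ?? "8080");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PORT must be a positive integer, got "${env["PORT"]}"`);
  }
  return {
    port,
    debug: env["DEBUG"] === "true",
    databaseUrl: env["DATABASE_URL"] ?? "postgres://localhost/dev",
  };
}

// Fixed inputs give deterministic outputs, which makes the config easy to inspect or test.
const devConfig = buildConfig({});
const prodConfig = buildConfig({ PORT: "443", DATABASE_URL: "postgres://db/prod" });
```

The real process environment would be handed in at the program's entry point, so tooling can evaluate the same function with any fixed input.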
The second one is impractical because the number of classes you'd need would increase exponentially with each added flag. The first one clashes with
> Prevent over-configurability,
because you're accepting an infinite range of possible callback methods instead of asking a simple yes/no question.
I don't get why this rule even exists, actually. If you take three boolean arguments, that's not okay, but if you wrap them all into a configuration object and pass that instead, that's suddenly... okay?
Yeah my experience is that, like you said, it's good for configuration, and beside this it only generates utils it has seen before.
The thing is I don't need to rewrite a function I've seen before, I'll use a library or framework that anyways offers all of that.
> The disadvantage with making the parameter class a member is that the code gets littered with indirection like 'data->param1' all over the place. This is much nicer.
Isn't that a rather trivial concern? Implementation inheritance is not "much nicer" than delegation, the opposite in fact is the case. It adds a dispatch step to all calls to virtual methods (including calls that are private to implementations at any level of your 'hierarchy') that is not what you would want in most cases. Which in turn means your entire class hierarchy has to be analyzed as a single, highly-coupled program module; it's quite literally impossible to understand portions of it in isolation.
> They lay operational traps everywhere in their quest to get things done fast (like directly embedding config data in code to avoid fixing the config format).
Specific to your example, rather than the sentiment, I think embedding config in code is highly valuable when you don't have a lengthy deployment cycle and have direct access to the source. It gives your compiler more information which can help prevent bad configurations (which are the cause of failures more often than anything else from what I've seen). Developers also have a better shot at feeling how badly configurable a particular component is when the configuration is code. It's much easier to hide the overly-configurable systems under a rug when the configuration is far away from implementation, in a DSL, in a different repository, or only visible at deployment.
The problem with environment variables as configuration is that it's unstructured, hardly documented, and overall hard to reproduce and inspect.
Nothing is worse than trying to understand an issue with a program that heavily relies on environment variables for configuration, as environment variables are designed to be short-lived, memory only.
A good old configuration file is the best. You can version it, distribute it, it's explicit, possibly structured, easily documented.
That doesn't mean that there is no room for environment variables, but these should be for local-only hacks and tweaks.
I'm not sure the opposite of "configuration" should be called "convention" - the worst abuses I've seen of punting to user configuration have been ones where the best solution was only determinable at runtime. (e.g. compare user-configured fixed window sizes in the doomed ISO protocol stack with dynamic window control in TCP)
Typically the user doesn't know best - they copied a config that worked for someone else, years ago on a different machine and workload, and don't know what any of the parameters actually mean. In the worst case (sendmail?) you have O(0) people who actually know how to use the configuration language, and 10 competing higher-level config generators.
what you need is really just Data structure + Algorithm
And Context (at least in my humble experience). Lately I got rid of OOP too and now all my functions accept `params` as a first argument which contains settings and the state of an object/subserver/etc. I find the ability to group functions into different files useful though (it’s js so I cannot do that like one does in c++).
Passing other parameters into the composed functions.
You can use objects to imitate named parameters in JS/TS. I think that's widely used convention and most of my APIs use one object parameter instead of multiple separate parameters exactly because named parameters are awesome. With TS it looks clunky with all those type declarations, but I can live with that.
As to your question, I share this feeling. Naming parameters must be standard feature in every language. Absolute majority of functions would benefit from verbose calling syntax.
I've actually started preferring the `action(subject, object)` form of programming which OO in C entails, rather than `subject.action(object)`. The latter is certainly easier for discovery via auto-complete (with most current tooling).
OO is not a natural way of programming. Everything always starts simply with functions that take arguments. Then you have a function that calls another function with some of the old arguments and some new ones.
Most people go: let's make the shared param an instance var.
const config = {}
foo(config)
function foo(config) { bar('bar', config) }
function bar(str, config) { ... }
becomes...
class MyClass {
  config
  constructor(config) { this.config = config }
  foo() { this.bar('bar') }
  bar(str) { ... }
}
The problem with a class is that every method of the class potentially depends on every other member of the class. What usually happens is that stuff is added to the class that doesn't make sense. And every class needs a noun to name it. Then you have to think about what the proper name is for this abstract thing when you don't even know what it is yet. Which leads to all these quirky class names that are unnecessary.
If you are explicit about what data dependencies each function has, it becomes easier to see the commonality that should be extracted into classes. Most people just shove everything in a class too soon. And most languages push you to use classes and methods...which usually look very different to how functions are represented.
Polymorphism is doable in plain old C with lookup tables and function pointers. If that is the only benefit, what is the point of creating a language where everything is an object?
The two main advantages of using classes are:
- a class is also a type (if you have a typed language),
- a class is also a module: you group together values that make sense together (the x and the y of a point) with the functions (methods) that interact with them.
When you have a lot of code flying around, you can add encapsulation (the private/public thingy) so the user of the code does not see the implementation, which helps to create libraries that can evolve independently from the applications using them.
Also compared to an associative array, a class is more compact in memory (granted JavaScript runtimes see dictionaries as hidden classes).
Just curious, but from a pragmatic view, what is the advantage of using anonymous functions instead of named functions for any mildly complex code?
Oh that's nothing. In Java I came up with a way to pass config options using method references as keys, so you can write something like:
createFoo(with(FooOptions::enable, true), with(FooOptions::size, 7));
And processing the options involves serialising each method reference to work out what it is!
"you better have the configuration defined in code" - you better do that anyway for anything that is used and will be around for a while :)
In my own applications, I've moved away from configuration files as much as possible. Instead, I provide an API and all my configuration is just code.
I think creating an entirely new Turing-complete language just for configuration is a waste of brain cells for everyone involved.
In short: the power of declarative configuration management. Way less error-prone than imperative shell scripts.
After having grokked Lisp, I think that parentheses provide a more elegant solution to the problem of parameter separation. Compare:
myfunction(a, b, c)
with
(myfunction a b c)
Parentheses allow you to leave out the commas between arguments. I like that a lot.
Why not store the configuration in a dedicated configuration service and let the app fetch them from this configuration server without the need to go through the environment variables layer? Wouldn't it simplify a use case where you have both web and mobile app and have them both fetch configuration in the same way?
As a user, I always prefer the bespoke configuration file format, provided it has comments explaining what each configuration option does.
The problem with using a general programming language for configuration is you end up needing to use that config in different places, in different contexts, and from different languages. So, you have to marshal out and marshal in that structure to some intermediate format.
You want it to be easy for humans to edit and grok, so you find a way to represent the core parts you care about as text and cordon off the general programming language to another area.
You're successful and the number of use cases you cover grows, so the size of that config grows.
And before you know it, you've invented YAML.
Whereas, if you use Cue instead of YAML it looks pretty similar - in fact some large subset of it will be parsed correctly by a YAML parser. But the difference is with Cue you can:
1. Validate the structures in your config
2. Deduplicate by referencing other values in your config (something you can't do in JSON/YAML).
3. Use language built-ins to reduce boilerplate and repetitive text.
Except it's not just for function arguments/parameters. It can be used in all sorts of contexts to spread out an array or object. E.g.
```js
const object1 = {
  field1: true,
  field2: 42
};
const object2 = {
  ...object1,
  field3: 'the meaning of life, the universe, and everything'
};
```
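The same spread syntax is a common way to layer caller-supplied options over defaults in a config object. A small sketch, with hypothetical option names:

```typescript
interface RetryOptions {
  retries: number;
  delayMs: number;
  verbose: boolean;
}

const defaultRetryOptions: RetryOptions = { retries: 3, delayMs: 100, verbose: false };

// Later spreads win, so caller-supplied keys override the defaults.
function resolveRetryOptions(overrides: Partial<RetryOptions> = {}): RetryOptions {
  return { ...defaultRetryOptions, ...overrides };
}

const resolvedOptions = resolveRetryOptions({ retries: 5 });
```

One caveat: a key explicitly set to `undefined` in the overrides also wins over the default, so this works best when callers simply omit the keys they don't care about.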
This seems pretty iffy for introducing static/persistent variables to a function. I mean, it can work, but it's semantically very confusing. Parameters are part of a function's interface. A global variable would be much better.
Fair. There are all those "configuration management" frameworks that make templating out config files from data a fairly solved problem, but if you're launching programs from other programs that you also wrote, I guess it's less code to have an arbitrarily long arguments list vs. also dealing with files.
All of the things you said apply equally well to function parameters.
If the method takes so many arguments that named parameters are necessary for readability, it is often a sign that the method is either too complex and should be split up, or that the method should accept a "config" record as a parameter. I think this makes it clearer because you can group the parameters and give them meaning. Then you know that "url", "headers" and "body" belong to the "request" record, while "customerId" and "taskId" are separate.
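A sketch of that grouping in TypeScript. The field names come from the comment above; the `HttpRequestSpec` type and the `describeTask` function are invented for illustration.

```typescript
// Hypothetical record grouping the parameters that belong together.
interface HttpRequestSpec {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// "customerId" and "taskId" stay separate because they are not part of the request.
function describeTask(request: HttpRequestSpec, customerId: number, taskId: number): string {
  const headerCount = Object.keys(request.headers).length;
  return `task ${taskId} for customer ${customerId}: POST ${request.url} ` +
    `(${headerCount} headers, ${request.body.length} chars)`;
}

const taskSummary = describeTask(
  { url: "https://example.com/tasks", headers: { "content-type": "application/json" }, body: "{}" },
  7,
  42,
);
```

Five loose parameters become three, and the signature itself now says which values travel together.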
Using globals is simpler, it's also pretty natural in event driven architectures. Passing everything via function arguments is welcome for library code, but there's little point to using it in application code. It just complicates things.