Multithreading doesn't automatically speed up code, and in some cases it can slow it down. (For instance, if the threads contend over a shared resource, in particular the same cache line, performance will suffer.) Can you say a little more about the computation you're doing?
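To make the cache-line point concrete, here is a minimal sketch of false sharing. It is not your code: the names `run_packed`, `run_padded` and `Padded` are made up for illustration, and the 64-byte alignment assumes a 64-byte cache line, which is typical for x86 but still an assumption. Four threads each bump their own counter; in the packed version the counters sit next to each other in memory, in the padded version each one gets its own line.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// One counter per assumed 64-byte cache line (hypothetical helper type).
#[repr(align(64))]
struct Padded(AtomicU64);

/// Increment a single counter `iters` times.
fn bump(counter: &AtomicU64, iters: u64) {
    for _ in 0..iters {
        counter.fetch_add(1, Ordering::Relaxed);
    }
}

/// Four threads, four adjacent counters that very likely share a cache line.
fn run_packed(iters: u64) -> (u64, Duration) {
    let counters: [AtomicU64; 4] = std::array::from_fn(|_| AtomicU64::new(0));
    let start = Instant::now();
    thread::scope(|s| {
        for c in &counters {
            s.spawn(move || bump(c, iters));
        }
    });
    let elapsed = start.elapsed();
    let total: u64 = counters.iter().map(|c| c.load(Ordering::Relaxed)).sum();
    (total, elapsed)
}

/// Same work, but every counter is aligned to its own 64-byte block.
fn run_padded(iters: u64) -> (u64, Duration) {
    let counters: [Padded; 4] = std::array::from_fn(|_| Padded(AtomicU64::new(0)));
    let start = Instant::now();
    thread::scope(|s| {
        for c in &counters {
            s.spawn(move || bump(&c.0, iters));
        }
    });
    let elapsed = start.elapsed();
    let total: u64 = counters.iter().map(|c| c.0.load(Ordering::Relaxed)).sum();
    (total, elapsed)
}
```

Calling both with a large `iters` in a release build and comparing the returned durations should show whether the effect matters on your machine. I'm deliberately not quoting numbers, since they depend heavily on the CPU and on how the threads get scheduled.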
The 12700K does have 12 cores, but it is an 8+4 configuration (8 performance cores, which are also capable of hyperthreading, plus 4 efficiency cores), giving you 8 fast threads + 8 hyperthreads + 4 "efficiency" threads, 20 hardware threads in total.
I wouldn't say it's impossible that what you see is just the performance available.
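If you want to check what your program actually sees, the standard library can report the suggested degree of parallelism. A small sketch (the wrapper name `logical_threads` is mine, not something from your code):

```rust
use std::thread;

/// Number of threads the standard library suggests using.
/// Falls back to 1 if the platform cannot answer the query.
fn logical_threads() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}
```

Note that this counts hardware threads, not physical cores, and makes no distinction between fast and slow ones. On a 12700K with everything enabled I would expect it to report 20, but affinity masks or container limits can lower that, so treat the value as a hint.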
Welcome to the real world! Why do you think people have invested a literally insane amount of effort into making single-thread performance better? The i7-12700K doesn't have 12 equal cores. Rather, it has 8 fast cores and 4 slow cores, and the difference between the fast cores and the slow ones is more than 2x. That means that once your program goes multithreaded, some of these calculations land on the slow cores. This would easily explain going from 2s to 3s.
Add the fact that when only one core is active the CPU picks some “blessed” core which is much faster than the others, but when all cores are loaded the CPU has to thermal-throttle everything. This means we should expect 4s or maybe even 5s… which is exactly what we observe.
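The slow-core argument can be put into a back-of-envelope model. This is only a sketch with made-up numbers: it assumes the work is cut into one equal chunk per core, that a slow core runs at exactly half the speed of a fast one, and it ignores throttling, hyperthreading and memory effects entirely. The function names are hypothetical.

```rust
/// Wall time when the work is split into equal chunks, one per core.
/// The run only finishes when the slowest core finishes its chunk.
fn static_split_time(total_work: f64, speeds: &[f64]) -> f64 {
    let chunk = total_work / speeds.len() as f64;
    speeds.iter().map(|s| chunk / s).fold(0.0, f64::max)
}

/// Best case with perfect load balancing: faster cores take more work.
fn balanced_time(total_work: f64, speeds: &[f64]) -> f64 {
    total_work / speeds.iter().sum::<f64>()
}

/// Hypothetical layout: 8 cores at speed 1.0 and 4 cores at speed 0.5.
fn example_speeds() -> Vec<f64> {
    let mut speeds = vec![1.0; 8];
    speeds.extend(vec![0.5; 4]);
    speeds
}
```

With 12 units of work and these assumed speeds, the equal split takes 2.0 time units because everyone waits for the slow cores, while a perfectly balanced schedule would take 1.2. So a work-stealing scheduler or smaller chunks can recover part of the loss, but the model says nothing about the throttling effect described above.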