I'm implementing the SerializableFunction interface and I'd like to reuse some expensive helper objects that I create in the constructor. When this class is used in a dataflow job, is a new instance created/cloned for every thread that uses it?
Thanks,
Genady
Short Answer
A SerializableFunction does not need to be thread-safe, since each thread gets its own deserialized instance. However, anything it accesses in a shared scope (for example, through static fields or static methods) must be thread-safe.
Long Answer
The SerializableFunction is serialized using Java's object serialization mechanism and saved as part of the Dataflow job specification. Depending on the specification and how it is optimized, the job will most likely be broken up into multiple units of work. Each worker machine may then request one or more units of work, which it processes in parallel. Each unit of work uses Java's object serialization mechanism to recreate its own instance of the SerializableFunction, and each thread is assigned to only one unit of work.
Note that even though each unit of work is assigned to one thread, expensive helper objects that are not part of the SerializableFunction itself, and are instead reached some other way (such as through a static reference or static method), may still be shared among multiple instances of the same SerializableFunction on the worker.
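To make the distinction concrete, here is a minimal sketch in plain Java that mimics the serialize/deserialize round trip with `ObjectOutputStream` and `ObjectInputStream`. It does not use the Dataflow SDK: the `SerializableFunction` interface below is a local stand-in, and `ExpensiveHelper`, `NormalizeFn`, `SEEN`, and `RoundTripDemo` are hypothetical names invented for illustration. One thing to keep in mind is that Java deserialization does not run the constructor of a `Serializable` class, so a helper built in the constructor either has to be serializable itself or, as assumed here, be marked `transient` and rebuilt lazily.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Local stand-in for the SDK interface so this sketch compiles without the SDK.
interface SerializableFunction<InputT, OutputT> extends Serializable {
    OutputT apply(InputT input);
}

// Hypothetical expensive helper; deliberately NOT Serializable.
final class ExpensiveHelper {
    String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }
}

final class NormalizeFn implements SerializableFunction<String, String> {
    private static final long serialVersionUID = 1L;

    // Static state is shared by every instance in the same JVM,
    // so it has to be thread-safe.
    static final ConcurrentMap<String, Integer> SEEN = new ConcurrentHashMap<>();

    // Per-instance state. It is transient (not serialized) and rebuilt lazily,
    // because the constructor is not run when an instance is deserialized.
    private transient ExpensiveHelper helper;

    private ExpensiveHelper helper() {
        if (helper == null) {
            helper = new ExpensiveHelper();
        }
        return helper;
    }

    // Exposed only so the demo can compare helper identities.
    ExpensiveHelper helperForInspection() {
        return helper();
    }

    @Override
    public String apply(String input) {
        String out = helper().normalize(input);
        SEEN.merge(out, 1, Integer::sum); // atomic update on the shared map
        return out;
    }
}

final class RoundTripDemo {
    // Serializes a value to bytes and reads it back, yielding a fresh copy.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        NormalizeFn original = new NormalizeFn();
        NormalizeFn a = roundTrip(original); // stands in for one unit of work
        NormalizeFn b = roundTrip(original); // stands in for another unit of work

        System.out.println("distinct instances: " + (a != b));
        System.out.println("distinct helpers: "
                + (a.helperForInspection() != b.helperForInspection()));
        a.apply("  Hello ");
        b.apply("HELLO");
        System.out.println("shared static count: " + NormalizeFn.SEEN.get("hello"));
    }
}
```

In this sketch each deserialized copy ends up with its own `ExpensiveHelper`, so the helper needs no synchronization, while the static `SEEN` map is visible to both copies and therefore uses a concurrent collection.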