159 commits
9215d64
added storage object to represent caches and associated readout funct…
tfesenbecker Oct 31, 2019
0a46da7
extended CLI to support storage files
tfesenbecker Oct 31, 2019
e727b69
extended simulator to support storage files
tfesenbecker Oct 31, 2019
d64954d
added new drone attribute sitename connecting drones and storage elem…
tfesenbecker Oct 31, 2019
5597434
added file provider object connecting storage objects and jobs
tfesenbecker Oct 31, 2019
bb1dcbe
added different caching/cache cleaning/walltime recalculation algorithms
tfesenbecker Oct 31, 2019
8d6db96
renamed storage readout
tfesenbecker Oct 31, 2019
fb150db
fixed debug output
tfesenbecker Nov 1, 2019
69072ae
renamed storage input reader
tfesenbecker Nov 1, 2019
53ebec5
updated Job class
tfesenbecker Nov 2, 2019
f997223
replaced function modifying walltime by function with property decorator
tfesenbecker Nov 2, 2019
2e2c06f
Revert "replaced function modifying walltime by function with propert…
tfesenbecker Nov 2, 2019
110b3e9
replaced function modifying walltime by function with property decorator
tfesenbecker Nov 2, 2019
b032a0d
resolving PEP8 issues
tfesenbecker Nov 2, 2019
7753d0d
Merge branch 'master' of https://github.com/MatterMiners/lapis into c…
tfesenbecker Nov 2, 2019
5123034
fixed file provider bug (wrong inputfiles dictionary)
tfesenbecker Nov 2, 2019
1c2fe9f
Update lapis/cli/simulate.py
tfesenbecker Nov 4, 2019
8739ce9
renamed function get_used_storage to _calculate_used_storage
tfesenbecker Nov 4, 2019
0b5a922
Merge branch 'cachingextension' of https://github.com/tfesenbecker/la…
tfesenbecker Nov 4, 2019
855242a
attached fileprovider to drone instead of job and passed it via make_…
tfesenbecker Nov 4, 2019
bfadacb
reworked file coverage function to return a score
tfesenbecker Nov 4, 2019
3f30c58
added proper __repr__ function
tfesenbecker Nov 4, 2019
2b214aa
added file classes
tfesenbecker Nov 7, 2019
2bd91d7
moved caching algorithm and associated cache cleanup to it's own class
tfesenbecker Nov 7, 2019
29576eb
Redesign of the storage class and associated changes
tfesenbecker Nov 7, 2019
146fbe3
put walltime getter and walltime recalculation back in seperate methods
tfesenbecker Nov 7, 2019
7ef8dd9
added parallel treatment of jobs input files in file provider
tfesenbecker Nov 7, 2019
b94ab82
fixed failed unit test that were caused by Drone without file provide…
tfesenbecker Nov 7, 2019
1e9e795
Merge branch 'master' of https://github.com/MatterMiners/lapis into c…
tfesenbecker Nov 7, 2019
191df2b
changed scoring to take filesizes into consideration
tfesenbecker Nov 8, 2019
a635318
Merge branch 'cachingextension' into feature/storageimprovement
tfesenbecker Nov 8, 2019
6f7ace1
Merge pull request #1 from tfesenbecker/feature/storageimprovement
tfesenbecker Nov 8, 2019
75165ad
fixed bug from merge
tfesenbecker Nov 8, 2019
d943ed6
Merge branch 'cachingextension' of https://github.com/tfesenbecker/la…
tfesenbecker Nov 8, 2019
9453632
Merge pull request #2 from tfesenbecker/feature/storageimprovement
tfesenbecker Nov 8, 2019
32faa38
removed debug output to fix unit test
tfesenbecker Nov 8, 2019
78a6f18
First steps towards including everything concerning caching into moni…
tfesenbecker Nov 11, 2019
c7c2e03
renamed method
tfesenbecker Nov 11, 2019
73ada77
split processing of job into file transfer and actual calculation
tfesenbecker Nov 11, 2019
02a79bb
refactored storage and file provider objects in order to use Pipe
tfesenbecker Nov 11, 2019
63664e0
added monitoring for remote and storage connections
tfesenbecker Nov 12, 2019
2df3841
small fix in monitoring
tfesenbecker Nov 12, 2019
a6a9783
Merge branch 'master' of https://github.com/MatterMiners/lapis into f…
tfesenbecker Nov 13, 2019
6e5cdd7
Merge branch 'master' of https://github.com/MatterMiners/lapis into f…
tfesenbecker Nov 14, 2019
cf5b3ab
adapted job walltime to new job processing in order to fix job event …
tfesenbecker Nov 17, 2019
3768f5b
minor clean ups
tfesenbecker Nov 17, 2019
df17230
added cache modelation via cachehitrate
tfesenbecker Nov 17, 2019
128425f
Update lapis/cachealgorithm.py
tfesenbecker Nov 17, 2019
04223e4
Update lapis/storage.py
tfesenbecker Nov 17, 2019
59830d7
Update lapis/storage.py
tfesenbecker Nov 17, 2019
a7b3323
Update lapis/storage.py
tfesenbecker Nov 17, 2019
eff97c4
Update lapis/storage.py
tfesenbecker Nov 17, 2019
ffca7a3
Update lapis/storage.py
tfesenbecker Nov 17, 2019
97e3f83
Update lapis/storage.py
tfesenbecker Nov 17, 2019
a4ceec4
resolved PEP8 issue
tfesenbecker Nov 17, 2019
b2cb120
Merge pull request #61 from tfesenbecker/feature/includepipes
tfesenbecker Nov 19, 2019
cdeeea6
minor fix
tfesenbecker Nov 19, 2019
edfc8f1
moved definition of remote throughput to CLI input, storage object th…
tfesenbecker Nov 20, 2019
25a2a23
Extended cache algorithm documentation
tfesenbecker Nov 20, 2019
b374cf6
implemented minor changes requested in PRs
tfesenbecker Nov 20, 2019
4487db6
Update lapis/storage.py
tfesenbecker Nov 20, 2019
521529f
Merge branch 'master' of https://github.com/MatterMiners/lapis into f…
tfesenbecker Nov 22, 2019
5ebf272
updated usim version requirement to 0.4.2
tfesenbecker Nov 26, 2019
1e814ea
completed renaming of file provider to connection
tfesenbecker Nov 26, 2019
43361c9
fixed job and simulator unit tests
tfesenbecker Nov 27, 2019
d47789f
fixed job and simulator unit tests
tfesenbecker Nov 27, 2019
a443618
replaced Storage.__repr__ to match the other classes
tfesenbecker Nov 27, 2019
cca1859
added missing default values for unit test compatibility
tfesenbecker Nov 27, 2019
36f5966
extended monitoring
tfesenbecker Nov 27, 2019
be5482f
fixed PEP8 issue
tfesenbecker Nov 27, 2019
4a358cf
updated via_usim decorator
eileen-kuehn Nov 27, 2019
d0051c2
added statistics about jobs in DummyScheduler
eileen-kuehn Nov 27, 2019
4cae9be
made unit tests succeed again
eileen-kuehn Nov 27, 2019
19ff945
made enabling of monitoring explicit
eileen-kuehn Nov 27, 2019
719fc97
blackened file
eileen-kuehn Nov 27, 2019
b87064d
changed cli to also start without any caching information
eileen-kuehn Nov 27, 2019
c1ae198
changed assignment of connections a bit
eileen-kuehn Nov 27, 2019
e57cd00
removed creation of connection module from test as it is not required
eileen-kuehn Nov 27, 2019
6df1b27
converted storage and file sizes to bytes
eileen-kuehn Nov 27, 2019
aced023
Merge branch 'master' into feature/caching
eileen-kuehn Nov 27, 2019
000be38
corrected access to numberofaccesses
eileen-kuehn Nov 27, 2019
08c6432
changed signature of StoredFile and adapted in IO operations
eileen-kuehn Nov 27, 2019
883b758
improved storage
eileen-kuehn Nov 27, 2019
df362a5
Merge branch 'master' into feature/caching
eileen-kuehn Nov 27, 2019
d5673b4
minimum required usim version set to 0.4.3
eileen-kuehn Nov 27, 2019
65fd2be
renamed remove_from_storage and add_to_storage to remove and add
eileen-kuehn Nov 27, 2019
325194d
made free_space a property of storage
eileen-kuehn Nov 27, 2019
22e4ea7
removed method find_file from storage
eileen-kuehn Nov 27, 2019
7faf2ee
added todo
eileen-kuehn Nov 27, 2019
e7a21e7
ignored B006 for flake8
eileen-kuehn Nov 27, 2019
a06dc62
if file is available on storage, transfer now receives correct size
eileen-kuehn Nov 27, 2019
3e3752b
fixed position of noqa
eileen-kuehn Nov 27, 2019
a570d50
made determine_inputfile_source private to connection
eileen-kuehn Nov 27, 2019
4ce52df
renamed transfer_inputfiles to transfer_files
eileen-kuehn Nov 27, 2019
6eb4615
removed queue from file lookup in storage and improved determine inpu…
eileen-kuehn Nov 27, 2019
7a7492f
improved stream file in connection
eileen-kuehn Nov 27, 2019
6a8801b
introduced HitrateStorage that transfers data based on a cache hitrate
eileen-kuehn Nov 27, 2019
43577ac
fixed position of noqa
eileen-kuehn Nov 27, 2019
19bde1f
removed cachehitrate from connection
eileen-kuehn Nov 27, 2019
0c2bd3d
connection now sets reference to remote_connection for storage
eileen-kuehn Nov 27, 2019
122a351
storage objects are now created based on specified cache hit rate
eileen-kuehn Nov 27, 2019
23a7d7b
adapted usage of caching for jobs
eileen-kuehn Nov 28, 2019
e2e8f57
introduced calculation efficiency to job
eileen-kuehn Nov 28, 2019
b8e51fb
introduced calculation efficiency for jobs to cli
eileen-kuehn Nov 28, 2019
2056b41
added more type hints for job
eileen-kuehn Nov 28, 2019
ddb5727
removed initialisation of connection
eileen-kuehn Nov 28, 2019
f730a51
moved caching related monitoring to extra file
eileen-kuehn Nov 28, 2019
1bb6b13
each simulation run now can be identified
eileen-kuehn Nov 28, 2019
4c771a0
added caching-specific monitoring information to documentation
eileen-kuehn Nov 28, 2019
250ddae
added type hints for simulator
eileen-kuehn Nov 28, 2019
ddba83b
changed sizes for storage to bytes
eileen-kuehn Nov 28, 2019
f49c4ef
fixed bug leading to full RAM
tfesenbecker Nov 28, 2019
7d92a69
Merge branch 'feature/caching' of https://github.com/MatterMiners/lap…
tfesenbecker Nov 28, 2019
0b3b81e
added RemoteStorage
eileen-kuehn Nov 28, 2019
e2650c0
renamed storagesize to size and ensured correct units
eileen-kuehn Nov 28, 2019
0adb6d0
ensured that size is always int
eileen-kuehn Nov 28, 2019
ac73c82
renamed method again
eileen-kuehn Nov 28, 2019
17b23da
fixed semmle issue
eileen-kuehn Nov 28, 2019
09ba7a4
added type hints
eileen-kuehn Nov 28, 2019
eae5555
removed cachealgorithm from storage and moved to connection
eileen-kuehn Nov 28, 2019
2e6c1dd
fixed bug leading to full RAM again
tfesenbecker Nov 29, 2019
0e3c573
fixed bug leading to full RAM again
tfesenbecker Nov 29, 2019
27c1ea7
Merge branch 'feature/caching' of https://github.com/MatterMiners/lap…
tfesenbecker Nov 29, 2019
2a61f00
fix hit rate based caching functionality
tfesenbecker Nov 29, 2019
4a95f3a
added first test for storage io
eileen-kuehn Nov 29, 2019
0e89302
adapted access to connection for RemoteStorage
eileen-kuehn Nov 29, 2019
5fc8b37
added new test for storage input
eileen-kuehn Nov 29, 2019
d3ff0da
fixed assignment of remote storage
eileen-kuehn Nov 29, 2019
cbc1507
reverted change of transfer signature and added typehints
eileen-kuehn Nov 29, 2019
d0124d1
introduced interface for storage
eileen-kuehn Nov 29, 2019
c7af203
added docstrings to storage interface
eileen-kuehn Nov 29, 2019
0019583
extended tests
eileen-kuehn Nov 29, 2019
01efb93
removed public update method from storage and made update private
eileen-kuehn Nov 29, 2019
a7177f7
added unit conversion for storageelement connection and remote connec…
tfesenbecker Nov 29, 2019
d684dc8
reformated debug output
tfesenbecker Nov 29, 2019
328e6dc
added debug output and fixed wrong function call
tfesenbecker Nov 29, 2019
0e49c32
added debug output and fixed wrong function call
tfesenbecker Nov 29, 2019
98dbba1
Merge branch 'feature/caching' of https://github.com/MatterMiners/lap…
tfesenbecker Nov 29, 2019
93589ba
Merge branch 'master' of https://github.com/MatterMiners/lapis into f…
tfesenbecker Nov 30, 2019
2351f51
reduced debug output, added default value for job_repr
tfesenbecker Dec 3, 2019
9e3a446
Added unittests to test functionality of Connection class with Hitrat…
tfesenbecker Dec 3, 2019
5cdcf35
extended type hints
tfesenbecker Dec 3, 2019
2747bce
fixed error in _usedstorage calculation
tfesenbecker Dec 3, 2019
4f6132f
storage.py now catches exception caused by not specified storage cont…
tfesenbecker Dec 3, 2019
fa0e98f
fixed test_storage unit tests
tfesenbecker Dec 3, 2019
d01ba77
added new unit tests for hitrate based caching
tfesenbecker Dec 3, 2019
5477344
removed forgotten debug output
tfesenbecker Dec 3, 2019
2dc77ff
storage_content_reader() handles empty files correctly now
tfesenbecker Dec 3, 2019
c573ac5
fixed line length
tfesenbecker Dec 3, 2019
f6fec85
fixed line length
tfesenbecker Dec 3, 2019
2fb8afe
Monitor.run now properly closes the sampling aiter
maxfischer2781 Dec 3, 2019
a5d13de
made the linter happy
maxfischer2781 Dec 3, 2019
d277354
added default value for calculation_efficiency
tfesenbecker Dec 3, 2019
a165249
remove calculation_efficiency default value
tfesenbecker Dec 3, 2019
691d523
corrected wrong attribute names
tfesenbecker Dec 3, 2019
b8ae864
fixed usage of transfer() interface and debug outputs
tfesenbecker Dec 3, 2019
73d6352
fixed debug output leading to failing unit test
tfesenbecker Dec 3, 2019
3c58797
fixed bug in filesize unit conversion
tfesenbecker Dec 4, 2019
616a7e1
Extended hitrate based caching to support different cache hitrates fo…
tfesenbecker Jan 20, 2020
2 changes: 1 addition & 1 deletion .coveragerc
@@ -18,4 +18,4 @@ exclude_lines =
raise NotImplementedError
return NotImplemented
if __name__ == "__main__"
if __name__ == '__main__'
if __name__ == '__main__'
1 change: 0 additions & 1 deletion .gitignore
@@ -207,4 +207,3 @@ Icon
Network Trash Folder
Temporary Items
.apdisk

2 changes: 1 addition & 1 deletion docs/Makefile
@@ -16,4 +16,4 @@ help:
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
6 changes: 3 additions & 3 deletions docs/source/topics/monitoring.rst
@@ -77,9 +77,9 @@ COBalD-specific Monitoring
Caching-specific Monitoring
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. TODO::

Will be added as soon as the caching branch is merged.
.. autofunction:: lapis.monitor.caching.storage_status
.. autofunction:: lapis.monitor.caching.storage_connection
.. autofunction:: lapis.monitor.caching.remote_connection

Telegraf
--------
57 changes: 57 additions & 0 deletions lapis/cachealgorithm.py
@@ -0,0 +1,57 @@
from typing import Optional, Callable, Tuple

from lapis.files import RequestedFile, StoredFile
from lapis.storageelement import StorageElement
from lapis.utilities.cache_cleanup_implementations import sort_files_by_cachedsince


def check_size(file: RequestedFile, storage: StorageElement):
    # the file can be cached at all if it fits the total capacity
    return storage.size >= file.filesize


def check_relevance(file: RequestedFile, storage: StorageElement):
    return True


def delete_oldest(
    file: RequestedFile, storage: StorageElement
) -> Tuple[bool, Tuple[StoredFile]]:
    deletable_files = []
    currently_free = storage.available
    if currently_free < file.filesize:
        # free space by evicting the files cached longest, oldest first
        sorted_files = iter(sort_files_by_cachedsince(storage.files.items()))
        while currently_free < file.filesize:
            deletable_files.append(next(sorted_files))
            currently_free += deletable_files[-1].filesize
    return True, tuple(deletable_files)


def delete_oldest_few_used(
    file: RequestedFile, storage: StorageElement
) -> Tuple[bool, Optional[Tuple[StoredFile]]]:
    deletable_files = []
    currently_free = storage.available
    if currently_free < file.filesize:
        sorted_files = sort_files_by_cachedsince(storage.files.items())
        for current_file in sorted_files:
            # only rarely accessed files are candidates for eviction
            if current_file.numberofaccesses < 3:
                deletable_files.append(current_file)
                currently_free += deletable_files[-1].filesize
                if currently_free >= file.filesize:
                    return True, tuple(deletable_files)
        return False, None
    return True, tuple(deletable_files)


class CacheAlgorithm(object):
    def __init__(self, caching_strategy: Callable, deletion_strategy: Callable):
        self._caching_strategy = caching_strategy
        self._deletion_strategy = deletion_strategy

    def consider(
        self, file: RequestedFile, storage: StorageElement
    ) -> Tuple[bool, Optional[Tuple[StoredFile]]]:
        if self._caching_strategy(file, storage):
            return self._deletion_strategy(file, storage)
        return False, None
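
A minimal usage sketch of the pluggable strategy design above, assuming this branch's lapis.cachealgorithm module is importable; StubFile, StubStorage, and the inline deletion lambda are illustrative stand-ins, not the real lapis RequestedFile/StorageElement classes:

    from collections import namedtuple

    from lapis.cachealgorithm import CacheAlgorithm, check_size

    # illustrative stand-ins exposing only the attributes the strategies read
    StubFile = namedtuple("StubFile", ["filename", "filesize"])

    class StubStorage:
        def __init__(self, size, available):
            self.size = size            # total capacity in bytes
            self.available = available  # currently free bytes

    algorithm = CacheAlgorithm(
        caching_strategy=lambda file, storage: check_size(file, storage),
        deletion_strategy=lambda file, storage: (True, tuple()),  # evict nothing
    )
    print(algorithm.consider(StubFile("a.root", 300), StubStorage(1000, 500)))
    # -> (True, ()): the file fits, no eviction needed
    print(algorithm.consider(StubFile("b.root", 2000), StubStorage(1000, 500)))
    # -> (False, None): check_size rejects a file larger than the cache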
57 changes: 52 additions & 5 deletions lapis/cli/simulate.py
@@ -1,3 +1,5 @@
from functools import partial

import click
import logging.handlers

@@ -9,6 +11,8 @@
from lapis.pool import StaticPool, Pool
from lapis.pool_io.htcondor import htcondor_pool_reader
from lapis.job_io.swf import swf_job_reader
from lapis.storageelement import StorageElement, HitrateStorage
from lapis.storage_io.storage import storage_reader

from lapis.scheduler import CondorJobScheduler
from lapis.simulator import Simulator
@@ -25,18 +29,22 @@

pool_import_mapper = {"htcondor": htcondor_pool_reader}

storage_import_mapper = {"standard": storage_reader}


@click.group()
@click.option("--seed", type=int, default=1234)
@click.option("--until", type=float)
@click.option("--log-tcp", "log_tcp", is_flag=True)
@click.option("--log-file", "log_file", type=click.File("w"))
@click.option("--log-telegraf", "log_telegraf", is_flag=True)
@click.option("--calculation-efficiency", type=float)
@click.pass_context
def cli(ctx, seed, until, log_tcp, log_file, log_telegraf):
def cli(ctx, seed, until, log_tcp, log_file, log_telegraf, calculation_efficiency):
    ctx.ensure_object(dict)
    ctx.obj["seed"] = seed
    ctx.obj["until"] = until
    ctx.obj["calculation_efficiency"] = calculation_efficiency
    monitoring_logger = logging.getLogger()
    monitoring_logger.setLevel(logging.DEBUG)
    time_filter = SimulationTimeFilter()
@@ -71,22 +79,51 @@ def cli(ctx, seed, until, log_tcp, log_file, log_telegraf):
    type=(click.File("r"), click.Choice(list(pool_import_mapper.keys()))),
    multiple=True,
)
@click.option(
    "--storage-files",
    "storage_files",
    type=(
        click.File("r"),
        click.File("r"),
        click.Choice(list(storage_import_mapper.keys())),
    ),
    default=(None, None, None),
)
@click.option("--remote-throughput", "remote_throughput", type=float, default=10)
@click.option("--cache-hitrate", "cache_hitrate", type=float, default=None)
@click.pass_context
def static(ctx, job_file, pool_file):
def static(ctx, job_file, pool_file, storage_files, remote_throughput, cache_hitrate):
    click.echo("starting static environment")
    simulator = Simulator(seed=ctx.obj["seed"])
    file, file_type = job_file
    simulator.create_job_generator(
        job_input=file, job_reader=job_import_mapper[file_type]
        job_input=file,
        job_reader=partial(
            job_import_mapper[file_type],
            calculation_efficiency=ctx.obj["calculation_efficiency"],
        ),
    )
    simulator.create_scheduler(scheduler_type=CondorJobScheduler)

    if all(storage_files):
        simulator.create_connection_module(remote_throughput * 1024 * 1024 * 1024)
        storage_file, storage_content_file, storage_type = storage_files
        simulator.create_storage(
            storage_input=storage_file,
            storage_content_input=storage_content_file,
            storage_reader=storage_import_mapper[storage_type],
            storage_type=partial(HitrateStorage, cache_hitrate)
            if cache_hitrate is not None
            else StorageElement,
        )
    for current_pool in pool_file:
        pool_file, pool_file_type = current_pool
        simulator.create_pools(
            pool_input=pool_file,
            pool_reader=pool_import_mapper[pool_file_type],
            pool_type=StaticPool,
        )
    simulator.enable_monitoring()
    simulator.run(until=ctx.obj["until"])
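
A self-contained sketch of the factory pattern behind the storage_type argument above: functools.partial pre-binds the hitrate, so downstream code can build either storage variant through a single callable. The stub classes are illustrative, not the real lapis constructors, and the assumption that the hitrate is the first positional argument mirrors the partial() call in the diff:

    from functools import partial

    class StorageElement:
        def __init__(self, name):
            self.name = name

    class HitrateStorage(StorageElement):
        def __init__(self, hitrate, name):  # hitrate assumed first positional
            super().__init__(name)
            self.hitrate = hitrate

    cache_hitrate = 0.5
    storage_type = (
        partial(HitrateStorage, cache_hitrate)
        if cache_hitrate is not None
        else StorageElement
    )
    element = storage_type(name="site-cache")
    print(type(element).__name__, getattr(element, "hitrate", None))
    # -> HitrateStorage 0.5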


@@ -108,7 +145,11 @@ def dynamic(ctx, job_file, pool_file):
    simulator = Simulator(seed=ctx.obj["seed"])
    file, file_type = job_file
    simulator.create_job_generator(
        job_input=file, job_reader=job_import_mapper[file_type]
        job_input=file,
        job_reader=partial(
            job_import_mapper[file_type],
            calculation_efficiency=ctx.obj["calculation_efficiency"],
        ),
    )
    simulator.create_scheduler(scheduler_type=CondorJobScheduler)
    for current_pool in pool_file:
@@ -119,6 +160,7 @@
            pool_type=Pool,
            controller=SimulatedLinearController,
        )
    simulator.enable_monitoring()
    simulator.run(until=ctx.obj["until"])


@@ -146,7 +188,11 @@ def hybrid(ctx, job_file, static_pool_file, dynamic_pool_file):
    simulator = Simulator(seed=ctx.obj["seed"])
    file, file_type = job_file
    simulator.create_job_generator(
        job_input=file, job_reader=job_import_mapper[file_type]
        job_input=file,
        job_reader=partial(
            job_import_mapper[file_type],
            calculation_efficiency=ctx.obj["calculation_efficiency"],
        ),
    )
    simulator.create_scheduler(scheduler_type=CondorJobScheduler)
    for current_pool in static_pool_file:
@@ -164,6 +210,7 @@
            pool_type=Pool,
            controller=SimulatedLinearController,
        )
    simulator.enable_monitoring()
    simulator.run(until=ctx.obj["until"])
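
A hedged end-to-end sketch of driving the extended CLI through click's test runner; all file names are placeholders, and the --job-file and --pool-file option names are assumptions, since those options sit outside the hunks shown here:

    from click.testing import CliRunner
    from lapis.cli.simulate import cli

    runner = CliRunner()
    result = runner.invoke(
        cli,
        [
            "--calculation-efficiency", "0.9",
            "static",
            "--job-file", "jobs.swf", "swf",          # placeholder input, assumed option name
            "--pool-file", "pools.csv", "htcondor",   # placeholder input, assumed option name
            "--storage-files", "storage.csv", "content.csv", "standard",
            "--remote-throughput", "1.0",
            "--cache-hitrate", "0.5",
        ],
    )
    print(result.output)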


149 changes: 149 additions & 0 deletions lapis/connection.py
@@ -0,0 +1,149 @@
import random

from typing import Union, Optional
from usim import Scope, time, Pipe

from lapis.cachealgorithm import (
    CacheAlgorithm,
    check_size,
    check_relevance,
    delete_oldest_few_used,
)
from lapis.storageelement import StorageElement, RemoteStorage
from lapis.files import RequestedFile, RequestedFile_HitrateBased
from lapis.monitor import sampling_required


class Connection(object):

    __slots__ = ("storages", "remote_connection", "caching_algorithm")

    def __init__(self, throughput=100):
        self.storages = dict()
        self.remote_connection = RemoteStorage(Pipe(throughput=throughput))
        self.caching_algorithm = CacheAlgorithm(
            caching_strategy=lambda file, storage: check_size(file, storage)
            and check_relevance(file, storage),
            deletion_strategy=lambda file, storage: delete_oldest_few_used(
                file, storage
            ),
        )

    def add_storage_element(self, storage_element: StorageElement):
        """
        Registers a storage element in the Connection module, clustering storage
        elements by sitename.
        :param storage_element:
        :return:
        """
        storage_element.remote_storage = self.remote_connection
        try:
            self.storages[storage_element.sitename].append(storage_element)
        except KeyError:
            self.storages[storage_element.sitename] = [storage_element]

    async def _determine_inputfile_source(
        self,
        requested_file: RequestedFile,
        dronesite: Optional[str],
        job_repr: Optional[str] = None,
    ) -> Union[StorageElement, RemoteStorage]:
        """
        Collects NamedTuples containing the amount of data of the requested file
        cached in a storage element and the storage element itself, for all
        reachable storage objects on the drone's site. The tuples are sorted by
        the amount of cached data, and the storage object caching the biggest
        part of the file is returned. If the file is not cached in any storage
        object, the connection module's remote connection is returned.
        :param requested_file:
        :param dronesite:
        :param job_repr:
        :return:
        """
        provided_storages = self.storages.get(dronesite, None)
        if provided_storages is not None:
            look_up_list = []
            for storage in provided_storages:
                look_up_list.append(storage.find(requested_file, job_repr))
            storage_list = sorted(look_up_list, key=lambda x: x[0], reverse=True)
            for entry in storage_list:
                # TODO: check should better check that size is bigger than requested
                if entry.cached_filesize > 0:
                    return entry.storage
        return self.remote_connection

    async def stream_file(
        self, requested_file: RequestedFile, dronesite, job_repr=None
    ):
        """
        Determines which storage object is used to provide the requested file and
        starts the file transfer. For files transferred via the remote connection,
        a potential cache decides whether to cache the file and handles the
        caching process.
        :param requested_file:
        :param dronesite:
        :param job_repr:
        :return:
        """
        used_connection = await self._determine_inputfile_source(
            requested_file, dronesite, job_repr
        )
        await sampling_required.put(used_connection)
        if used_connection == self.remote_connection and self.storages.get(
            dronesite, None
        ):
            try:
                potential_cache = random.choice(self.storages[dronesite])
                cache_file, files_for_deletion = self.caching_algorithm.consider(
                    file=requested_file, storage=potential_cache
                )
                if cache_file:
                    for file in files_for_deletion:
                        await potential_cache.remove(file, job_repr)
                    await potential_cache.add(requested_file, job_repr)
                else:
                    print(
                        f"APPLY CACHING DECISION: Job {job_repr}, "
                        f"File {requested_file.filename}: File wasn't "
                        f"cached @ {time.now}"
                    )
            except KeyError:
                pass
        print(f"now transferring {requested_file.filesize} from {used_connection}")
        await used_connection.transfer(requested_file, job_repr=job_repr)
        print(
            "Job {}: finished transferring file {}: {}GB @ {}".format(
                job_repr, requested_file.filename, requested_file.filesize, time.now
            )
        )

    async def transfer_files(self, drone, requested_files: dict, job_repr=None):
        """
        Converts dict information about requested files into RequestedFile objects
        and launches the streaming of all files in parallel.
        :param drone:
        :param requested_files:
        :param job_repr:
        :return:
        """
        start_time = time.now
        async with Scope() as scope:
            for inputfilename, inputfilespecs in requested_files.items():
                if "hitrates" in inputfilespecs.keys():
                    requested_file = RequestedFile_HitrateBased(
                        inputfilename,
                        inputfilespecs["usedsize"],
                        inputfilespecs["hitrates"],
                    )
                else:
                    requested_file = RequestedFile(
                        inputfilename, inputfilespecs["usedsize"]
                    )
                scope.do(self.stream_file(requested_file, drone.sitename, job_repr))
        stream_time = time.now - start_time
        print(
            "STREAMED files {} in {}".format(list(requested_files.keys()), stream_time)
        )
        return stream_time
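
The source-selection rule in _determine_inputfile_source can be seen in isolation in this self-contained sketch; LookUpInformation mimics the lookup tuples that storage.find() is assumed to return, and pick_source reproduces only the sorting and fallback logic:

    from collections import namedtuple

    # illustrative stand-in for the lookup tuples returned by storage.find()
    LookUpInformation = namedtuple("LookUpInformation", ["cached_filesize", "storage"])

    def pick_source(look_up_list, remote_connection):
        """Pick the storage caching the most data, else fall back to remote."""
        for entry in sorted(
            look_up_list, key=lambda x: x.cached_filesize, reverse=True
        ):
            if entry.cached_filesize > 0:
                return entry.storage
        return remote_connection

    lookups = [LookUpInformation(0, "cache_a"), LookUpInformation(512, "cache_b")]
    print(pick_source(lookups, "remote"))                      # -> cache_b
    print(pick_source([LookUpInformation(0, "cache_a")], "remote"))  # -> remote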