Update and secure references #20

Open · wants to merge 1 commit into master
content/articles/2015/10/distributed-ad-hoc.md (16 changes: 8 additions & 8 deletions)

@@ -15,8 +15,8 @@ Distributed
 The `distributed` project prototype provides distributed computing on a cluster
 in pure Python.

-* [docs](http://distributed.readthedocs.org/en/latest/),
-  [source](http://github.com/mrocklin/distributed/),
+* [docs](https://distributed.dask.org/en/latest/),
+  [source](https://github.com/mrocklin/distributed/),
   [chat](https://gitter.im/mrocklin/distributed)

 concurrent.futures interface
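For readers skimming the diff: the `concurrent.futures` interface named in the hunk above looks roughly like the following. A minimal sketch assuming the 2015-era API, where the client object was called `Executor` (later renamed `Client`); the scheduler address is a placeholder.

```python
# Sketch of distributed's concurrent.futures-style interface.
# Assumes the 2015-era `Executor` (today: distributed.Client);
# the scheduler address below is a placeholder.
from distributed import Executor

def square(x):
    return x ** 2

e = Executor('127.0.0.1:8786')   # connect to a running scheduler
future = e.submit(square, 10)    # returns a Future immediately
print(future.result())           # blocks until the remote result: 100
```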
@@ -109,8 +109,8 @@ As an example we perform a binary tree reduction on a sequence of random
 arrays.

 This is the kind of algorithm you would find hard-coded into a library like
-[Spark](http://spark.apache.org/) or
-[dask.array](http://dask.pydata.org/en/latest/array.html)/[dask.dataframe](http://dask.pydata.org/en/latest/dataframe.html)
+[Spark](https://spark.apache.org/) or
+[dask.array](https://docs.dask.org/en/latest/array.html)/[dask.dataframe](https://docs.dask.org/en/latest/dataframe.html)
 but that we can accomplish by hand with some for loops while still using
 parallel distributed computing. The difference here is that we're not limited
 to the algorithms chosen for us and can screw around more freely.
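The binary tree reduction described above fits in a few lines of plain Python. A sketch assuming the same `Executor` `e` as in the previous example and a power-of-two number of arrays:

```python
# Hand-rolled binary tree reduction over Futures: sum neighbours pairwise,
# halving the list each round. Assumes len(futures) is a power of two and
# an `Executor` e connected as in the previous sketch.
from operator import add
import numpy as np

futures = [e.submit(np.random.random, 1000000) for _ in range(16)]

while len(futures) > 1:
    futures = [e.submit(add, left, right)
               for left, right in zip(futures[::2], futures[1::2])]

result = futures[0].result()   # only the final array travels back to us
```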
@@ -156,15 +156,15 @@ Notes
 -----

 Various other Python frameworks provide distributed function evaluation. A few
-are listed [here](http://distributed.readthedocs.org/en/latest/related-work.html)
+are listed [here](https://distributed.dask.org/en/latest/related-work.html)
 . Notably we're stepping on the toes of
-[SCOOP](http://scoop.readthedocs.org/en/0.7/), an excellent library that also
+[SCOOP](https://scoop.readthedocs.org/en/0.7/), an excellent library that also
 provides a distributed `concurrent.futures` interface.

 The `distributed` project could use a more distinct name. Any suggestions?

 For more information see the following links:

-* [Documentation](http://distributed.readthedocs.org/en/latest/)
-* [Source on Github](http://github.com/mrocklin/distributed/)
+* [Documentation](https://distributed.dask.org/en/latest/)
+* [Source on Github](https://github.com/mrocklin/distributed/)
 * [Gitter chat](https://gitter.im/mrocklin/distributed)
content/articles/2015/10/distributed-hdfs.md (8 changes: 4 additions & 4 deletions)

@@ -63,7 +63,7 @@ We put a dataset on HDFS instance through the command line interface:
 Then we query the namenode to discover how it sharded this file.

 To avoid JVM dependence we use Spotify's
-[snakebite](http://snakebite.readthedocs.org/en/latest/) library which
+[snakebite](https://snakebite.readthedocs.org/en/latest/) library which
 includes the protobuf headers necessary to interact with the namenode directly,
 without using the Java HDFS client library.
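For context, querying the namenode with snakebite looks roughly like this. A sketch with placeholder host, port, and path; 8020 is a common namenode RPC port.

```python
# Minimal snakebite sketch: speak protobuf to the namenode directly,
# no JVM involved. Host, port, and path are placeholders.
from snakebite.client import Client

client = Client('namenode.example.com', 8020)

# ls yields plain dicts of file metadata (path, length, owner, ...)
for entry in client.ls(['/data']):
    print(entry['path'], entry['length'])
```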

@@ -159,8 +159,8 @@ think about remote hosts that have files on their local file systems. HDFS has
 played its part and can exit the stage.

 *Note: since writing this we've found a
-[wonderful article](http://jvns.ca/blog/2014/05/15/diving-into-hdfs/) by
-[Julia Evans](http://jvns.ca/), that describes a similar process.*
+[wonderful article](https://jvns.ca/blog/2014/05/15/diving-into-hdfs/) by
+[Julia Evans](https://jvns.ca/), that describes a similar process.*


 Data-local tasks with distributed
@@ -198,7 +198,7 @@ Or alternatively we've wrapped up both steps into a little convenience function:
 ```

 As a reminder from
-[last time](http://blaze.pydata.org/blog/2015/10/27/distributed-ad-hoc/) these
+[last time](https://blaze.pydata.org/blog/2015/10/27/distributed-ad-hoc/) these
 operations produce `Future` objects that point to remote results on the worker
 computers. This does not pull results back to local memory. We can use these
 futures in future computations with the executor.
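The Future-passing pattern this reminder describes, sketched with hypothetical `load` and `combine` functions and an `Executor` `e` connected as in the earlier sketch; the point is that futures feed straight into further `submit` calls without pulling data back.

```python
# Futures compose: results stay on the workers until we ask for them.
# `load` and `combine` are hypothetical stand-ins for real work;
# the paths are placeholders.
def load(path):
    with open(path) as f:
        return f.read()

def combine(x, y):
    return x + y

a = e.submit(load, '/data/part-1.csv')   # Future pointing at remote data
b = e.submit(load, '/data/part-2.csv')
c = e.submit(combine, a, b)              # futures passed straight back in
print(c.result())                        # data moves locally only here
```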
content/articles/2016/02/dask-distributed-1.md (12 changes: 6 additions & 6 deletions)

@@ -36,7 +36,7 @@ cluster.

 We provision nine `m3.2xlarge` nodes on EC2. These have eight cores and 30GB
 of RAM each. On this cluster we provision one scheduler and nine workers (see
-[setup docs](http://distributed.readthedocs.org/en/latest/setup.html)). (More
+[setup docs](https://distributed.dask.org/en/latest/setup.html)). (More
 on launching in later posts.) We have five months of data, from 2015-01-01 to
 2015-05-31 on the `githubarchive-data` bucket in S3. This data is publicly
 avaialble if you want to play with it on EC2. You can download the full
@@ -105,7 +105,7 @@ records = e.persist(records)

 The data lives in S3 in hourly files as gzipped encoded, line delimited JSON.
 The `s3.read_text` and `text.map` functions produce
-[dask.bag](http://dask.pydata.org/en/latest/bag.html) objects which track our
+[dask.bag](https://docs.dask.org/en/latest/bag.html) objects which track our
 operations in a lazily built task graph. When we ask the executor to `persist`
 this collection we ship those tasks off to the scheduler to run on all of the
 workers in parallel. The `persist` function gives us back another `dask.bag`
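The lazy-build-then-persist pattern this hunk describes, sketched against today's `dask.bag` API rather than the 2016-era `distributed.s3` helpers the post itself used; the bucket glob is a placeholder.

```python
# Build a lazy task graph from JSON-lines files on S3, then persist it so
# the work runs on the cluster and results stay in distributed memory.
# The s3:// glob is a placeholder.
import json
import dask.bag as db

records = (db.read_text('s3://githubarchive-data/2015-*.json.gz')
             .map(json.loads))        # nothing has executed yet

records = records.persist()          # ship the graph off to the scheduler
print(records.count().compute())     # only small summaries come back
```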
@@ -192,7 +192,7 @@ overhead.
 Investigate Jupyter
 -------------------

-We investigate the activities of [Project Jupyter](http://jupyter.org/). We
+We investigate the activities of [Project Jupyter](https://jupyter.org/). We
 chose this project because it's sizable and because we understand the players
 involved and so can check our accuracy. This will require us to filter our
 data to a much smaller subset, then find popular repositories and members.
@@ -489,10 +489,10 @@ done differently with more time.
 Links
 -----

-* [dask](https://dask.pydata.org/en/latest/), the original project
-* [distributed](https://distributed.readthedocs.org/en/latest/), the
+* [dask](https://docs.dask.org/en/latest/), the original project
+* [distributed](https://distributed.dask.org/en/latest/), the
   distributed memory scheduler powering the cluster computing
-* [dask.bag](http://dask.pydata.org/en/latest/bag.html), the user API we've
+* [dask.bag](https://docs.dask.org/en/latest/bag.html), the user API we've
   used in this post.
 * This post largely repeats work by [Blake Griffith](https://github.com/cowlicks) in a
   [similar post](https://www.continuum.io/content/dask-distributed-and-anaconda-cluster)
content/pages/overview.md (14 changes: 7 additions & 7 deletions)

@@ -30,17 +30,17 @@ The following characteristics can define a particular Data Processing System:
 The goal of the Blaze ecosystem is to simplify data processing for users by providing:

 - A common language to describe data that it's independent of the Data Processing System, called
-  [**datashape**](http://blaze.github.io/pages/projects/datashape).
+  [**datashape**](https://blaze.pydata.org/pages/projects/datashape).
 - A common interface to query data that it's independent of the Data Processing System, called
-  [**blaze**](http://blaze.github.io/pages/projects/blaze).
+  [**blaze**](https://blaze.pydata.org/pages/projects/blaze).
 - A common utility library to move data from one format or system to another, called
-  [**odo**](http://blaze.github.io/pages/projects/odo).
-- Compressed column stores, called [**bcolz**](http://blaze.github.io/pages/projects/bcolz) and
-  [**castra**](http://blaze.github.io/pages/projects/castra).
-- A parallel computational engine, called [**dask**](http://blaze.github.io/pages/projects/dask).
+  [**odo**](https://blaze.pydata.org/pages/projects/odo).
+- Compressed column stores, called [**bcolz**](https://blaze.pydata.org/pages/projects/bcolz) and
+  [**castra**](https://blaze.pydata.org/pages/projects/castra).
+- A parallel computational engine, called [**dask**](https://blaze.pydata.org/pages/projects/dask).


 ## Learn more

 The project repositories can be found under the [Github Blaze Organization](https://github.com/blaze). Feel free to
-reach out to the Blaze Developers through our mailing list, [email protected].
+reach out to the Blaze Developers through our mailing list, [email protected].
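To make the overview's bullets concrete: a rough sketch of the query and migration interfaces named above, using their documented entry points (`blaze.data`, `blaze.compute`, `odo`); the file and table names are placeholders.

```python
# blaze: backend-independent queries; odo: data migration in one call.
# 'iris.csv' and the sqlite URI below are placeholders.
from blaze import data, compute
from odo import odo

d = data('iris.csv')                           # wrap any supported source
expr = d[d.species == 'Iris-setosa'].sepal_length.mean()
print(compute(expr))                           # run on the backing engine

odo('iris.csv', 'sqlite:///flowers.db::iris')  # move the data into SQLite
```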
content/pages/projects/blaze.md (2 changes: 1 addition & 1 deletion)

@@ -2,4 +2,4 @@ Title: Blaze
 Project: core
 Category: Projects
 Subtitle: An interface to query data on different storage systems
-Docs: http://blaze.readthedocs.org/en/latest/index.html
+Docs: https://blaze.readthedocs.org/en/latest/index.html
content/pages/projects/dask.md (2 changes: 1 addition & 1 deletion)

@@ -2,4 +2,4 @@ Title: Dask
 Project: core
 Category: Projects
 Subtitle: Parallel computing through task scheduling and blocked algorithms
-Docs: http://dask.readthedocs.org/en/latest/
+Docs: https://docs.dask.org/en/latest/
content/pages/projects/datashape.md (2 changes: 1 addition & 1 deletion)

@@ -2,4 +2,4 @@ Title: Datashape
 Project: core
 Category: Projects
 Subtitle: A data description language
-Docs: http://datashape.readthedocs.org/en/latest/
+Docs: https://datashape.readthedocs.io/en/latest/
content/pages/projects/odo.md (2 changes: 1 addition & 1 deletion)

@@ -2,4 +2,4 @@ Title: Odo
 Project: core
 Category: Projects
 Subtitle: Data migration between different storage systems
-Docs: http://odo.readthedocs.org/en/latest/
+Docs: https://odo.readthedocs.io/en/latest/
content/pages/talks/ep2015-blaze.md (2 changes: 1 addition & 1 deletion)

@@ -5,5 +5,5 @@ Category: Talks
 Tags: blaze,dask,odo,datashape
 Video: https://www.youtube.com/embed/QKBcnEhkCtk
 Site: https://ep2015.europython.eu/conference/talks/scale-your-data-not-your-process-welcome-to-the-blaze-ecosystem
-Slides: http://chdoig.github.io/ep2015-blaze/
+Slides: https://chdoig.github.io/ep2015-blaze/
pelicanconf.py (2 changes: 1 addition & 1 deletion)

@@ -4,7 +4,7 @@

 AUTHOR = u'Blaze Developers'
 SITENAME = u'The Blaze Ecosystem'
-SITEURL = 'http://blaze.github.io/'
+SITEURL = 'https://blaze.github.io/'
 TITLE = 'The Blaze Ecosystem'
 SUBTITLE = 'Connecting people to data'
theme/templates/base.html (8 changes: 4 additions & 4 deletions)

@@ -15,8 +15,8 @@
 <!-- FONT
 –––––––––––––––––––––––––––––––––––––––––––––––––– -->
-<link href='//fonts.googleapis.com/css?family=Raleway:100,200,300,400,600' rel='stylesheet' type='text/css'>
-<link href='http://fonts.googleapis.com/css?family=Droid+Serif:400,700' rel='stylesheet' type='text/css'>
+<link href='https://fonts.googleapis.com/css?family=Raleway:100,200,300,400,600' rel='stylesheet' type='text/css'>
+<link href='https://fonts.googleapis.com/css?family=Droid+Serif:400,700' rel='stylesheet' type='text/css'>

 <!-- CSS
 –––––––––––––––––––––––––––––––––––––––––––––––––– -->

@@ -29,7 +29,7 @@
 <!-- Scripts
 –––––––––––––––––––––––––––––––––––––––––––––––––– -->

-<script src="//ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
+<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>

 <!-- Favicon
 –––––––––––––––––––––––––––––––––––––––––––––––––– -->

@@ -72,4 +72,4 @@
 –––––––––––––––––––––––––––––––––––––––––––––––––– -->
 </body>

-</html>
+</html>
theme/templates/includes/comments.html (6 changes: 3 additions & 3 deletions)

@@ -15,9 +15,9 @@
 (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
 })();
 </script>
-<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by
+<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by
 Disqus.</a></noscript>
-<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
+<a href="https://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
 <div style="margin-bottom:20px"></div>
 </section>
-{% endif %}
+{% endif %}
theme/templates/includes/twitter.html (6 changes: 3 additions & 3 deletions)

@@ -1,7 +1,7 @@
 {% if TWITTER_USERNAME %}
 <div style="text-align:right">
-<a href="http://twitter.com/share" class="twitter-share-button" data-count="horizontal" data-via="{{TWITTER_USERNAME}}">Tweet</a>
+<a href="https://twitter.com/share" class="twitter-share-button" data-count="horizontal" data-via="{{TWITTER_USERNAME}}">Tweet</a>
 </div>
-<script type="text/javascript" src="http://platform.twitter.com/widgets.js"></script>
+<script type="text/javascript" src="https://platform.twitter.com/widgets.js"></script>

-{% endif %}
+{% endif %}